Abstract

We consider the problem of estimating the unknown response function in the multichannel
deconvolution model with long-range dependent Gaussian errors. We do not limit our
consideration to a specific type of long-range dependence; rather, we assume that the errors
satisfy a general assumption in terms of the smallest and largest eigenvalues of their
covariance matrices. We derive minimax lower bounds for the quadratic risk in the proposed
multichannel deconvolution model when the response function is assumed to belong to a Besov
ball and the blurring function is assumed to possess some smoothness properties, including
both regular-smooth and super-smooth convolutions. Furthermore, we propose an adaptive
wavelet estimator of the response function that is asymptotically optimal (in the minimax
sense), or near-optimal within a logarithmic factor, in a wide range of Besov balls. It is
shown that the optimal convergence rates depend on the balance between the smoothness
parameter of the response function, the kernel parameters of the blurring function, the long
memory parameters of the errors, and how the total number of observations is distributed
among the total number of channels. Some examples of inverse problems in mathematical
physics, where one needs to recover initial or boundary conditions on the basis of
observations from a noisy solution of a partial differential equation, are used to
illustrate the application of the theory we developed. The optimal convergence rates and the
adaptive estimators we consider extend the ones studied by Pensky and Sapatinas (2009, 2010)
for independent and identically distributed Gaussian errors to the case of long-range
dependent Gaussian errors.

AMS (2000) Subject Classifications: 62G05 (primary), 62G08, 35J05, 35K05, 
35L05 (secondary) 

Keywords and Phrases: adaptivity, Besov spaces, block thresholding, deconvolution,
Fourier analysis, functional data, long-range dependence, Meyer wavelets, minimax
estimators, multichannel deconvolution, partial differential equations, stationary
sequences, wavelet analysis.
1 Introduction



We consider the problem of estimating the unknown response function $f(\cdot) \in L^2(T)$ from
observations $y(u_l, t_i)$ driven by

$$
y(u_l, t_i) = \int_T g(u_l, t_i - x) f(x)\,dx + \xi_{li}, \quad l = 1, 2, \ldots, M, \quad i = 1, 2, \ldots, N, \tag{1.1}
$$

where $u_l \in U = [a, b]$, $0 < a \le b < \infty$, $T = [0, 1]$, $t_i = i/N$, and the errors $\xi_{li}$ are
Gaussian random variables, independent for different $l$'s, but dependent for different $i$'s.

Denote the total number of observations $n = NM$ and assume, without loss of generality,
that $N = 2^J$ for some integer $J > 0$. For each $l = 1, 2, \ldots, M$, let $\xi^{(l)}$ be a Gaussian
vector with components $\xi_{li}$, $i = 1, 2, \ldots, N$, and let
$\Sigma^{(l)} := \mathrm{Cov}(\xi^{(l)}) := E[\xi^{(l)} (\xi^{(l)})^T]$ be its covariance matrix.

Assumption A1: For each $l = 1, 2, \ldots, M$, $\Sigma^{(l)}$ satisfies the following condition:
there exist constants $K_1$ and $K_2$ ($0 < K_1 \le K_2 < \infty$), independent of $l$ and $N$, such that,
for each $l = 1, 2, \ldots, M$,

$$
K_1 N^{2d_l} \le \lambda_{\min}(\Sigma^{(l)}) \le \lambda_{\max}(\Sigma^{(l)}) \le K_2 N^{2d_l}, \quad 0 \le d_l < 1/2, \tag{1.2}
$$

where $\lambda_{\min}$ and $\lambda_{\max}$ are the smallest and the largest eigenvalues of (the Toeplitz matrix)
$\Sigma^{(l)}$. (Here, and in what follows, "$T$" denotes the transpose of a vector or a matrix.)

Assumption A1 is valid when, for each $l = 1, 2, \ldots, M$, $\xi^{(l)}$ is a second-order
stationary Gaussian sequence with spectral density satisfying certain assumptions. We shall
elaborate on this issue in Section 2. Note that, in the case of independent errors, for
each $l = 1, 2, \ldots, M$, $\Sigma^{(l)}$ is proportional to the identity matrix and $d_l = 0$. In this
case, the multichannel deconvolution model (1.1) reduces to the one with independent
and identically distributed Gaussian errors. In view of (1.1), the limit situation $d_l = 0$,
$l = 1, 2, \ldots, M$, can be thought of as the standard multichannel deconvolution model
described in Pensky and Sapatinas (2009, 2010).

Model (1.1) can also be thought of as the discrete version of a model referred to as
the functional deconvolution model by Pensky and Sapatinas (2009, 2010). The functional
deconvolution model has a multitude of applications. In particular, it can be used in a
number of inverse problems in mathematical physics where one needs to recover initial
or boundary conditions on the basis of observations from a noisy solution of a partial
differential equation. For instance, the problem of recovering the initial condition for
parabolic equations based on observations in a fixed-time strip was first investigated in
Lattes and Lions (1967), and the problem of recovering the boundary condition for elliptic
equations based on observations in an internal domain was studied in Golubev and
Khasminskii (1999) and Golubev (2004).
In the case when $a = b$, the functional deconvolution model reduces to the standard
deconvolution model. This model has been the subject of a great array of research
papers since the late 1980s, but the most significant contribution was that of Donoho (1995),
who was the first to devise a wavelet solution to the problem. This has attracted the
attention of a good number of researchers; see, e.g., Abramovich and Silverman (1998),
Kalifa and Mallat (2003), Donoho and Raimondo (2004), Johnstone and Raimondo (2004),
Johnstone, Kerkyacharian, Picard and Raimondo (2004), Kerkyacharian, Picard and
Raimondo (2007). (For related results on the density deconvolution problem, we refer to, e.g.,
Pensky and Vidakovic (1999), Walter and Shen (1999), Fan and Koo (2002).)

In the multichannel deconvolution model studied by Pensky and Sapatinas (2009,
2010), as well as in the very recent extension of their results to derivative estimation
by Navarro et al. (2013), it is assumed that errors are independent and identically
distributed Gaussian random variables. However, empirical evidence has shown that, even at
large lags, the correlation structure in the errors can decay at a hyperbolic rate, rather
than an exponential rate. To account for this, a great many papers on long-range
dependence (LRD) have been written. The study of LRD (also called long memory) has a
number of applications, as reflected by the very large number of articles having
LRD or long memory in their titles, in areas such as climate study, DNA sequencing,
econometrics, finance, hydrology, internet modeling, signal and image processing, physics
and even linguistics. Other applications can be found in Beran (1992, 1994), Beran et al.
(2013) and Doukhan et al. (2003).

Although quite a few LRD models have been considered in the regression estimation
framework, very little has been done in the standard deconvolution model. The density
deconvolution setup has also witnessed some shift towards analyzing the problem for
dependent processes. The argument behind this is that a number of statistical models,
such as non-linear GARCH and continuous-time stochastic volatility models, can be looked
at as density deconvolution models if we apply a simple logarithmic transformation, and
thus there is a need to account for dependence in the data. This was started by Van Zanten
et al. (2008), who investigated the wavelet-based density deconvolution studied by Pensky and
Vidakovic (1999) with a relaxation to weakly dependent processes. Comte et al. (2008)
analyzed another adaptive estimator that was proposed earlier, but under the assumption
that the sequence is strictly stationary and not necessarily independent. It was
Kulik (2008) who considered density deconvolution for LRD and short-range
dependent (SRD) processes; however, Kulik (2008) did not consider nonlinear wavelet
estimators but dealt instead with linear kernel estimators.

In nonparametric regression estimation, ARIMA-type models for the errors were
analyzed in Cheng and Robinson (1994), with error terms of the form $\sigma(x_i, \xi_i)$. In Csörgő
and Mielniczuk (2000), the error terms were modeled as infinite-order moving average
processes. Mielniczuk and Wu (2004) investigated another form of LRD, with the assumption
that $x_i$ and $\xi_i$ are not necessarily independent for the same $i$. ARIMA-type error models
were also considered in Kulik and Raimondo (2009). In the standard deconvolution model,
and using a maxiset approach, Wishart (2012) applied a fractional Brownian motion to
model the presence of LRD, while Wang (2012) used a minimax approach to study the
problem of recovering a function $f$ from a more general noisy linear transformation where
the noise is also a fractional Brownian motion.

The objective of this paper is to study the multichannel deconvolution model from
a minimax point of view, with the relaxation that errors exhibit LRD. We do not limit
our consideration to a specific type of LRD: the only restriction is that the errors should
satisfy Assumption A1. In particular, we derive minimax lower bounds for the $L^2$-risk in
model (1.1) under Assumption A1 when $f(\cdot)$ is assumed to belong to a Besov ball and
$g(\cdot,\cdot)$ has smoothness properties similar to those in Pensky and Sapatinas (2009, 2010),
including both regular-smooth and super-smooth convolutions. In addition, we propose
an adaptive wavelet estimator for $f(\cdot)$ and show that such an estimator is asymptotically
optimal (or near-optimal within a logarithmic factor) in the minimax sense, in a wide range
of Besov balls. We prove that the convergence rates of the resulting estimators depend
on the balance between the smoothness parameter (of the response function $f(\cdot)$), the
kernel parameters (of the blurring function $g(\cdot,\cdot)$), and the long memory parameters $d_l$,
$l = 1, 2, \ldots, M$ (of the error sequence $\xi^{(l)}$). Since the parameters $d_l$ depend on the values
of $l$, the convergence rates have more complex expressions than the ones obtained in Kulik
and Raimondo (2009) when studying nonparametric regression estimation with ARIMA-type
error models. The convergence rates we derive are more similar in nature to those in
Pensky and Sapatinas (2009, 2010). In particular, the convergence rates depend on how
the total number $n = NM$ of observations is distributed among the total number $M$ of
channels. As we illustrate in two examples, convergence rates are not affected by long-range
dependence in the case of super-smooth convolutions; however, the situation changes in
the regular-smooth case.

The paper is organized as follows. Section 2 discusses stationary sequences with LRD
errors, justifies Assumption A1 and provides illustrative examples of stationary sequences
satisfying this assumption. Section 3 describes the construction of the suggested wavelet
estimator of $f(\cdot)$. Section 4 derives minimax lower bounds for the $L^2$-risk for observations
from model (1.1). Section 5 proves that the suggested wavelet estimator is adaptive
and asymptotically optimal (in the minimax sense), or near-optimal within a logarithmic
factor, in a wide range of Besov balls. Section 6 presents examples of inverse problems
in mathematical physics where one needs to recover initial or boundary conditions on the
basis of observations from a noisy solution of a partial differential equation, to illustrate
the application of the theory we developed. Section 7 concludes with a brief discussion.
Section 8 contains the proofs of the theoretical results obtained in earlier sections.

2 Stationary Sequences with Long-Range Dependence 

In this section, for simplicity of exposition, we consider one sequence of errors
$\{\xi_j : j = 1, 2, \ldots\}$. Assume that $\{\xi_j : j = 1, 2, \ldots\}$ is a second-order stationary sequence with
covariance function $\gamma_\xi(k) := \gamma(k)$, $k = 0, \pm 1, \pm 2, \ldots$. The spectral density is defined as

$$
a_\xi(\lambda) := a(\lambda) := \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma(k) \exp(-ik\lambda), \quad \lambda \in [-\pi, \pi].
$$

On the other hand, the inverse transform which recovers $\gamma(k)$, $k = 0, \pm 1, \pm 2, \ldots$, from
$a(\lambda)$, $\lambda \in [-\pi, \pi]$, is given by

$$
\gamma(k) = \int_{-\pi}^{\pi} e^{ik\lambda} a(\lambda)\,d\lambda, \quad k = 0, \pm 1, \pm 2, \ldots,
$$

under the assumption that the spectral density $a(\lambda)$, $\lambda \in [-\pi, \pi]$, is square-integrable.



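Let $\Sigma = [\gamma(j - k)]_{j,k=1}^{N}$ be the covariance matrix of $(\xi_1, \ldots, \xi_N)$. Define
$\mathcal{X} = \{x \in \mathbb{C}^N : x^* x = 1\}$, where $x^*$ is the complex-conjugate transpose of $x$. Since $\Sigma$ is
Hermitian, one has

$$
\lambda_{\min}(\Sigma) = \inf_{x \in \mathcal{X}} (x^* \Sigma x) \quad \text{and} \quad \lambda_{\max}(\Sigma) = \sup_{x \in \mathcal{X}} (x^* \Sigma x). \tag{2.1}
$$

With the definitions introduced above,

$$
x^* \Sigma x = \sum_{j,k=1}^{N} \bar{x}_j\, \gamma(j - k)\, x_k = \int_{-\pi}^{\pi} \Big| \sum_{j=1}^{N} x_j e^{-ij\lambda} \Big|^2 a(\lambda)\,d\lambda.
$$

Note that, by the Parseval identity, the function $h(\lambda) = \big|\sum_{j=1}^{N} x_j e^{-ij\lambda}\big|^2$, $\lambda \in [-\pi, \pi]$,
belongs to the set

$$
\Big\{ h : h \ \text{symmetric}, \ |h|_\infty \le N, \ \int_{-\pi}^{\pi} h(\lambda)\,d\lambda = 2\pi \Big\}. \tag{2.2}
$$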



Let $d \in [0, 1/2)$. Consider the following class of spectral densities

$$
\mathcal{F}_d = \big\{ a : a(\lambda) = |\lambda|^{-2d} a^*(\lambda), \ 0 < C_{\min} \le |a^*(\lambda)| \le C_{\max} < \infty, \ \lambda \in [-\pi, \pi] \big\}. \tag{2.3}
$$

Below we provide two examples of second-order stationary sequences such that their
spectral densities $a(\lambda)$, $\lambda \in [-\pi, \pi]$, belong to the class described in (2.3).

Fractional ARIMA(0, d, 0). Let $\{\xi_j : j = 1, 2, \ldots\}$ be the second-order stationary
sequence

$$
\xi_j = \sum_{m=0}^{\infty} a_m \eta_{j-m},
$$

where $\eta_j$ are uncorrelated, zero-mean, random variables, $\sigma_\eta^2 := \mathrm{Var}(\eta_j) < \infty$, and

$$
a_m = (-1)^m \binom{-d}{m} = (-1)^m \frac{\Gamma(1 - d)}{\Gamma(m + 1)\,\Gamma(1 - d - m)}
$$

with $d \in [0, 1/2)$. Then, $a_m$, $m = 0, 1, \ldots$, are the coefficients in the power-series
representation

$$
A(z) := (1 - z)^{-d} := \sum_{m=0}^{\infty} a_m z^m.
$$

Therefore, the spectral density $a(\lambda)$, $\lambda \in [-\pi, \pi]$, of $\{\xi_j : j = 1, 2, \ldots\}$, is given by

$$
a(\lambda) = \frac{\sigma_\eta^2}{2\pi} \big| A(e^{-i\lambda}) \big|^2 = \frac{\sigma_\eta^2}{2\pi} \big| 1 - e^{-i\lambda} \big|^{-2d} = \frac{\sigma_\eta^2}{2\pi} \big| 2(1 - \cos\lambda) \big|^{-d} \sim \frac{\sigma_\eta^2}{2\pi} |\lambda|^{-2d} \quad (\lambda \to 0).
$$

Hence, the sequence $\{\xi_j : j = 1, 2, \ldots\}$ has spectral density $a(\lambda)$, $\lambda \in [-\pi, \pi]$, that belongs
to the class described in (2.3). (The sequence $\{\xi_j : j = 1, 2, \ldots\}$ is called the fractional
ARIMA(0, d, 0) time series.)
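The coefficient formula and the behaviour of the spectral density near zero can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names are hypothetical, the recursion $a_m = a_{m-1}(m - 1 + d)/m$ is the standard equivalent of the Gamma-function formula, and $\sigma_\eta^2 = 1$ is assumed by default.

```python
import math

def arfima_coeffs(d, n_terms):
    """Coefficients a_m of (1 - z)^(-d), using a_0 = 1 and
    a_m = a_{m-1} * (m - 1 + d) / m (equivalent to the Gamma formula)."""
    a = [1.0]
    for m in range(1, n_terms):
        a.append(a[-1] * (m - 1 + d) / m)
    return a

def arfima_coeff_gamma(d, m):
    """Same coefficient from the Gamma-function formula (0 < d < 1/2, small m)."""
    return (-1) ** m * math.gamma(1 - d) / (math.gamma(m + 1) * math.gamma(1 - d - m))

def arfima_spectral_density(lam, d, sigma2=1.0):
    """a(lambda) = sigma^2 / (2 pi) * |2 (1 - cos lambda)|^(-d)."""
    return sigma2 / (2 * math.pi) * abs(2 * (1 - math.cos(lam))) ** (-d)

d = 0.3
a = arfima_coeffs(d, 200)
# Partial sum of the power series at z = 0.5 versus the closed form (1 - z)^(-d).
series = sum(c * 0.5 ** m for m, c in enumerate(a))
print(series, (1 - 0.5) ** (-d))
# Ratio of a(lambda) to its small-frequency approximation |lambda|^(-2d) / (2 pi).
lam = 1e-3
print(arfima_spectral_density(lam, d) / (lam ** (-2 * d) / (2 * math.pi)))
```

The recursion is used for the long coefficient vector because the Gamma function overflows for large arguments; the Gamma form is kept only as a cross-check for small $m$.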

Fractional Gaussian Noise. Assume that $B_H(u)$, $u \in [0, \infty)$, is a fractional Brownian
motion with the Hurst parameter $H \in [1/2, 1)$. Define the second-order stationary
sequence $\xi_j = B_H(j) - B_H(j - 1)$, $j = 1, 2, \ldots$. Its spectral density $a(\lambda)$, $\lambda \in [-\pi, \pi]$, is
given by (see, e.g., [14], p. 222)

$$
a(\lambda) = \sigma^2 (2\pi)^{-2H-2}\, \Gamma(2H + 1) \sin(\pi H)\, 4 \sin^2(\lambda/2) \sum_{k=-\infty}^{\infty} \big| k + (\lambda/2\pi) \big|^{-2H-1},
$$

and, hence,

$$
a(\lambda) \sim \frac{\sigma^2}{2\pi}\, \Gamma(2H + 1) \sin(\pi H)\, |\lambda|^{1-2H} \quad (\lambda \to 0).
$$

Hence, the sequence $\{\xi_j : j = 1, 2, \ldots\}$ has spectral density $a(\lambda)$, $\lambda \in [-\pi, \pi]$, that belongs
to the class $\mathcal{F}_d$ with $d = H - 1/2$. (The sequence $\{\xi_j : j = 1, 2, \ldots\}$ is called the fractional
Gaussian noise.)
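The hyperbolic decay of correlations that motivates LRD can be seen directly in the time domain. The sketch below uses the standard autocovariance of fractional Gaussian noise, $\gamma(k) = \tfrac{\sigma^2}{2}\big(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\big)$, which is a well-known fact not stated in the text above; the function name is hypothetical and $\sigma^2 = 1$ is assumed by default.

```python
def fgn_autocov(k, H, sigma2=1.0):
    """Autocovariance of fractional Gaussian noise at lag k (standard closed form)."""
    k = abs(k)
    return 0.5 * sigma2 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H) + abs(k - 1) ** (2 * H))

H = 0.8          # corresponds to d = H - 1/2 = 0.3
for k in (1, 10, 100, 1000):
    # Compare with the leading-order term H (2H - 1) k^(2H - 2) for large lags.
    approx = H * (2 * H - 1) * k ** (2 * H - 2)
    print(k, fgn_autocov(k, H), approx)
```

For $H = 1/2$ the autocovariance vanishes at every nonzero lag, which is the independent-error case $d = 0$.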

It follows from (2.3) that, for $a \in \mathcal{F}_d$, one has $a(\lambda) \sim |\lambda|^{-2d}$ $(\lambda \to 0)$. It also turns
out that the condition $a \in \mathcal{F}_d$, $d \in [0, 1/2)$, implies that all eigenvalues of the covariance
matrix $\Sigma$ are of asymptotic order $N^{2d}$ $(N \to \infty)$. In particular, the following lemma is
true.

Lemma 1 Assume that $\{\xi_j : j = 1, 2, \ldots\}$ is a second-order stationary sequence with
spectral density $a \in \mathcal{F}_d$, $d \in [0, 1/2)$. Then, for some constants $K_{1d}$ and $K_{2d}$
($0 < K_{1d} \le K_{2d} < \infty$) that depend on $d$ only,

$$
K_{1d} N^{2d} \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le K_{2d} N^{2d}.
$$

Remark 1 If $d = 0$, then $\mathcal{F}_d$ is the class of spectral densities $a(\lambda)$ that are bounded away
from $0$ and $\infty$ for all $\lambda \in [-\pi, \pi]$. In particular, the corresponding second-order stationary
sequences $\{\xi_j : j = 1, 2, \ldots\}$ are weakly dependent. Then, the statement of Lemma 1
reduces to a result in Grenander and Szegő [11], Section 5.2.
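The extreme eigenvalues appearing in Assumption A1 can be inspected numerically for a concrete sequence. The sketch below builds the Toeplitz covariance matrix of a fractional ARIMA(0, d, 0) sequence and prints its smallest and largest eigenvalues divided by $N^{2d}$. It assumes NumPy is available and uses the standard closed form $\gamma(0) = \sigma_\eta^2\,\Gamma(1 - 2d)/\Gamma(1 - d)^2$ with the recursion $\gamma(k) = \gamma(k-1)(k - 1 + d)/(k - d)$, which is not stated in the text; all function names are hypothetical.

```python
import math
import numpy as np

def arfima_autocov(d, n, sigma2=1.0):
    """gamma(0), ..., gamma(n-1) for fractional ARIMA(0, d, 0) (standard closed form)."""
    g = np.empty(n)
    g[0] = sigma2 * math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2
    for k in range(1, n):
        g[k] = g[k - 1] * (k - 1 + d) / (k - d)
    return g

def extreme_eigenvalues(d, n):
    """Smallest and largest eigenvalues of the n x n Toeplitz matrix [gamma(j - k)]."""
    g = arfima_autocov(d, n)
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    sigma = g[idx]                      # Toeplitz covariance matrix
    ev = np.linalg.eigvalsh(sigma)      # eigenvalues in ascending order
    return ev[0], ev[-1]

d = 0.3
for n in (64, 128, 256):
    lo, hi = extreme_eigenvalues(d, n)
    # Both extremes are normalised by N^(2d) so their scaling in N can be inspected.
    print(n, lo / n ** (2 * d), hi / n ** (2 * d))
```

For $d = 0$ the matrix is the identity, so both printed ratios equal one; for $d > 0$ the printed values show how each extreme eigenvalue behaves relative to $N^{2d}$ for this particular sequence.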
It follows immediately from Lemma 1 that if, for each $l = 1, 2, \ldots, M$, $\xi^{(l)}$ is a
second-order stationary Gaussian sequence with spectral density $a_l \in \mathcal{F}_{d_l}$, $d_l \in [0, 1/2)$, the
$\xi^{(l)}$ are independent for different $l$'s, and the $d_l$'s are uniformly bounded, then Assumption
A1 holds.

Corollary 1 For each $l = 1, 2, \ldots, M$, let $\xi^{(l)}$ be a second-order stationary Gaussian
sequence with spectral density $a_l \in \mathcal{F}_{d_l}$, $d_l \in [0, 1/2)$. We assume that $\xi^{(l)}$ are independent
for different $l$'s. Let $d_l$, $l = 1, 2, \ldots, M$, be uniformly bounded, i.e., there exists $d^*$
($0 \le d^* < 1/2$) such that, for each $l = 1, 2, \ldots, M$,

$$
0 \le d_l \le d^* < 1/2. \tag{2.4}
$$

Then, Assumption A1 holds.



3 The Estimation Algorithm 

In what follows, $\langle \cdot, \cdot \rangle$ denotes the inner product in $\mathbb{R}^N$. We also denote the
complex conjugate of $a \in \mathbb{C}$ by $\bar{a}$, the discrete Fourier basis on the interval $T$ by
$e_m(t_i) = e^{-i 2\pi m t_i}$, $t_i = i/N$, $i = 1, 2, \ldots, N$, $m = 0, \pm 1, \pm 2, \ldots$, and the complex
conjugate of the matrix $A$ by $A^*$.

Recall the multichannel deconvolution model (1.1). Denote

$$
h(u_l, t_i) = \int_T g(u_l, t_i - x) f(x)\,dx, \quad l = 1, 2, \ldots, M, \quad i = 1, 2, \ldots, N.
$$

Then, equation (1.1) can be rewritten as

$$
y(u_l, t_i) = h(u_l, t_i) + \xi_{li}, \quad l = 1, 2, \ldots, M, \quad i = 1, 2, \ldots, N. \tag{3.1}
$$

For each $l = 1, 2, \ldots, M$, let $h_m(u_l) = \langle e_m, h(u_l, \cdot) \rangle$, $y_m(u_l) = \langle e_m, y(u_l, \cdot) \rangle$,
$z_{lm} = \langle e_m, \xi^{(l)} \rangle$, $g_m(u_l) = \langle e_m, g(u_l, \cdot) \rangle$ and $f_m = \langle e_m, f \rangle$ be the discrete Fourier coefficients
of the $\mathbb{R}^N$ vectors $h(u_l, t_i)$, $y(u_l, t_i)$, $\xi_{li}$, $g(u_l, t_i)$ and $f(t_i)$, $i = 1, 2, \ldots, N$, respectively.
Then, applying the discrete Fourier transform to (3.1), one obtains, for any $u_l \in U$,
$l = 1, 2, \ldots, M$,

$$
y_m(u_l) = g_m(u_l) f_m + N^{-1/2} z_{lm} \tag{3.2}
$$

and

$$
h_m(u_l) = g_m(u_l) f_m. \tag{3.3}
$$
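The product structure in (3.3) is the discrete convolution theorem. The following sketch checks it on a small grid for a single channel. The normalisations are assumptions made only for this illustration: Fourier coefficients are taken as $c_m(v) = N^{-1}\sum_i v_i e^{-2\pi i m i/N}$, and the integral over $T$ is replaced by a periodic Riemann sum; the function names are hypothetical.

```python
import cmath

def dft_coeffs(v):
    """c_m(v) = (1/N) * sum_i v_i * exp(-2 pi i m i / N), m = 0, ..., N-1 (assumed convention)."""
    n = len(v)
    return [sum(v[i] * cmath.exp(-2j * cmath.pi * m * i / n) for i in range(n)) / n
            for m in range(n)]

def circular_convolution(g, f):
    """h_i = (1/N) * sum_k g_{(i-k) mod N} * f_k, a Riemann-sum stand-in for the integral over T."""
    n = len(f)
    return [sum(g[(i - k) % n] * f[k] for k in range(n)) / n for i in range(n)]

# Arbitrary test vectors on a grid of N = 8 points.
g = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.25, 0.5]
f = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
h = circular_convolution(g, f)
hm, gm, fm = dft_coeffs(h), dft_coeffs(g), dft_coeffs(f)
# Largest deviation between h_m and g_m * f_m over all frequencies.
max_err = max(abs(hm[m] - gm[m] * fm[m]) for m in range(len(f)))
print(max_err)
```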



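Multiplying both sides of (3.2) by $N^{-2d_l}\, \overline{g_m(u_l)}$, adding them together over
$l = 1, 2, \ldots, M$, and solving for $f_m$, we obtain the following estimator of $f_m$:

$$
\hat{f}_m = \frac{\sum_{l=1}^{M} N^{-2d_l}\, \overline{g_m(u_l)}\, y_m(u_l)}{\sum_{l=1}^{M} N^{-2d_l}\, |g_m(u_l)|^2}. \tag{3.4}
$$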
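A direct implementation of this weighting step may help to fix ideas. The sketch below assumes the estimator is the ratio obtained by multiplying (3.2) by $N^{-2d_l}\,\overline{g_m(u_l)}$, summing over the channels and dividing by the resulting coefficient of $f_m$; the function name and the test values are hypothetical.

```python
def estimate_fm(y_m, g_m, d, N):
    """Combine M channels into one estimate of the Fourier coefficient f_m.

    y_m, g_m : per-channel Fourier coefficients y_m(u_l) and g_m(u_l)
    d        : per-channel long-memory parameters d_l
    N        : number of observations per channel
    """
    num = sum(N ** (-2 * dl) * gl.conjugate() * yl for yl, gl, dl in zip(y_m, g_m, d))
    den = sum(N ** (-2 * dl) * abs(gl) ** 2 for gl, dl in zip(g_m, d))
    return num / den

# Noise-free check: if y_m(u_l) = g_m(u_l) * f_m exactly, the estimator returns f_m.
f_true = 1.5 - 0.7j
g_m = [0.5 + 0.2j, 0.1 - 0.3j, 0.8 + 0.0j]
y_m = [gl * f_true for gl in g_m]
print(estimate_fm(y_m, g_m, d=[0.1, 0.2, 0.4], N=1024))
```

Channels with a larger $d_l$ receive a smaller weight $N^{-2d_l}$, so strongly dependent channels contribute less to the combined estimate.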
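Let $\varphi^*(\cdot)$ and $\psi^*(\cdot)$ be the Meyer scaling and mother wavelet functions, respectively,
defined on the real line (see, e.g., Meyer (1992)), and obtain a periodized version of the Meyer
wavelet basis as in Johnstone et al. (2004), i.e., for $j \ge 0$ and $k = 0, 1, \ldots, 2^j - 1$,

$$
\varphi_{jk}(x) = \sum_{i \in \mathbb{Z}} 2^{j/2} \varphi^*(2^j (x + i) - k), \quad \psi_{jk}(x) = \sum_{i \in \mathbb{Z}} 2^{j/2} \psi^*(2^j (x + i) - k), \quad x \in T.
$$

Following Pensky and Sapatinas (2009, 2010), using the periodized Meyer wavelet basis
described above, for some $j_0 \ge 0$, expand $f(\cdot) \in L^2(T)$ as

$$
f(t) = \sum_{k=0}^{2^{j_0}-1} a_{j_0 k}\, \varphi_{j_0 k}(t) + \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^j-1} b_{jk}\, \psi_{jk}(t), \quad t \in T. \tag{3.5}
$$

Furthermore, by Plancherel's formula, the scaling coefficients, $a_{j_0 k} = \langle f, \varphi_{j_0 k} \rangle$, and the
wavelet coefficients, $b_{jk} = \langle f, \psi_{jk} \rangle$, of $f(\cdot)$ can be represented as

$$
a_{j_0 k} = \sum_{m \in C_{j_0}} f_m\, \overline{\varphi_{m j_0 k}}, \quad b_{jk} = \sum_{m \in C_j} f_m\, \overline{\psi_{mjk}}, \tag{3.6}
$$

where $\varphi_{m j_0 k} = \langle e_m, \varphi_{j_0 k} \rangle$ and $\psi_{mjk} = \langle e_m, \psi_{jk} \rangle$ are the Fourier coefficients of the
wavelet basis functions, $C_{j_0} = \{m : \varphi_{m j_0 k} \ne 0\}$ and, for any $j \ge j_0$,

$$
C_j = \{m : \psi_{mjk} \ne 0\} \subseteq \frac{2\pi}{3} \big[ -2^{j+2}, -2^j \big] \cup \big[ 2^j, 2^{j+2} \big].
$$

(Note that the cardinality $|C_j|$ of the set $C_j$ is $|C_j| = 4\pi 2^j$, see, e.g., Johnstone et al.
(2004).) Estimates of $a_{j_0 k}$ and $b_{jk}$ are readily obtained by substituting $f_m$ in (3.6) with
$\hat{f}_m$, i.e.,

$$
\hat{a}_{j_0 k} = \sum_{m \in C_{j_0}} \hat{f}_m\, \overline{\varphi_{m j_0 k}}, \quad \hat{b}_{jk} = \sum_{m \in C_j} \hat{f}_m\, \overline{\psi_{mjk}}. \tag{3.7}
$$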

We now construct a (block thresholding) wavelet estimator of $f(\cdot)$, suggested by
Pensky and Sapatinas (2009, 2010). For this purpose, we divide the wavelet coefficients at
each resolution level into blocks of length $\ln n$. Let $A_j$ and $U_{jr}$ be the following sets of
indices

$$
A_j = \{ r \mid r = 1, 2, \ldots, 2^j / \ln n \},
$$

$$
U_{jr} = \{ k \mid k = 0, 1, \ldots, 2^j - 1; \ (r - 1) \ln n \le k \le r \ln n - 1 \}.
$$

Denote

$$
B_{jr} = \sum_{k \in U_{jr}} b_{jk}^2, \quad \hat{B}_{jr} = \sum_{k \in U_{jr}} \hat{b}_{jk}^2. \tag{3.8}
$$

Finally, for any $j_0 \ge 0$, the (block thresholding) wavelet estimator $\hat{f}_n(\cdot)$ of $f(\cdot)$ is
constructed as

$$
\hat{f}_n(t) = \sum_{k=0}^{2^{j_0}-1} \hat{a}_{j_0 k}\, \varphi_{j_0 k}(t) + \sum_{j=j_0}^{J-1} \sum_{r \in A_j} \sum_{k \in U_{jr}} \hat{b}_{jk}\, \mathbb{1}(|\hat{B}_{jr}| \ge \lambda_j)\, \psi_{jk}(t), \quad t \in T, \tag{3.9}
$$

where $\mathbb{1}(A)$ is the indicator function of the set $A$, and the resolution levels $j_0$ and $J$ and
the thresholds $\lambda_j$ will be defined in Section 5.
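The keep-or-kill rule applied to blocks of coefficients can be sketched in a few lines. The sketch below works on the estimated wavelet coefficients of a single resolution level and assumes that the block energy is the sum of squared coefficients in the block and that the block length is $\ln n$ rounded to an integer; the function name and the rounding are choices made for this illustration only.

```python
import math

def block_threshold(b_hat, n, lam):
    """Block thresholding of one resolution level.

    b_hat : estimated wavelet coefficients at level j
    n     : total number of observations (block length is about ln n)
    lam   : threshold for this level
    """
    block_len = max(1, int(round(math.log(n))))
    out = [0.0] * len(b_hat)
    for start in range(0, len(b_hat), block_len):
        block = b_hat[start:start + block_len]
        energy = sum(b * b for b in block)          # block energy
        if energy >= lam:                           # keep the whole block, or drop it
            out[start:start + block_len] = block
    return out

# Three blocks at n = 1000 (block length 7): strong, negligible, and a short final block.
b_hat = [3.0] * 7 + [0.01] * 7 + [0.0, 2.0]
print(block_threshold(b_hat, n=1000, lam=1.0))
```

Coefficients are retained or discarded jointly within a block, so an isolated small coefficient survives if its neighbours carry enough energy.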

In what follows, the symbol $C$ is used for a generic positive constant, independent
of $n$, while the symbol $K$ is used for a generic positive constant, independent of $m$, $n$, $M$
and $u_1, u_2, \ldots, u_M$. Either of $C$ or $K$ may take different values at different places.



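4 Minimax Lower Bounds for the $L^2$-Risk

Denote

$$
s' = s + 1/2 - 1/p, \quad s^* = s + 1/2 - 1/p', \quad p' = \min\{p, 2\}. \tag{4.1}
$$

Assume that the unknown response function $f(\cdot)$ belongs to a Besov ball $B^s_{p,q}(A)$ of radius
$A > 0$, so that the wavelet coefficients $a_{j_0 k}$ and $b_{jk}$ defined in (3.6) satisfy the following
relation

$$
B^s_{p,q}(A) = \Bigg\{ f \in L^2(T) : \Bigg( \sum_{k=0}^{2^{j_0}-1} |a_{j_0 k}|^p \Bigg)^{1/p} + \Bigg( \sum_{j=j_0}^{\infty} 2^{j s' q} \Bigg( \sum_{k=0}^{2^j-1} |b_{jk}|^p \Bigg)^{q/p} \Bigg)^{1/q} \le A \Bigg\}. \tag{4.2}
$$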
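The quantity bounded by $A$ in the definition of the Besov ball can be evaluated for finitely many coefficients. The following sketch computes the standard Besov sequence norm with $s' = s + 1/2 - 1/p$, for finite $p$ and $q$ and a finite number of resolution levels; the function name and the input layout are assumptions made for this illustration.

```python
def besov_seq_norm(a, b_levels, j0, s, p, q):
    """Besov sequence norm of wavelet coefficients (finite p, q).

    a        : scaling coefficients at level j0
    b_levels : b_levels[i] holds the wavelet coefficients at level j0 + i
    """
    s_prime = s + 0.5 - 1.0 / p
    coarse = sum(abs(x) ** p for x in a) ** (1.0 / p)
    detail = 0.0
    for i, b in enumerate(b_levels):
        j = j0 + i
        level = sum(abs(x) ** p for x in b) ** (q / p)
        detail += 2 ** (j * s_prime * q) * level
    return coarse + detail ** (1.0 / q)

# A function belongs to the ball of radius A when this value does not exceed A.
print(besov_seq_norm(a=[3.0, 4.0], b_levels=[[1.0, 0.0]], j0=1, s=1.0, p=2.0, q=2.0))
```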

Below, we construct minimax lower bounds for the (quadratic) $L^2$-risk. For this purpose,
we define the minimax $L^2$-risk over the set $V \subseteq L^2(T)$ as

$$
R_n(V) = \inf_{\tilde{f}} \sup_{f \in V} E \| \tilde{f} - f \|^2,
$$

where $\|g\|$ is the $L^2$-norm of a function $g(\cdot)$ and the infimum is taken over all possible
estimators $\tilde{f}(\cdot)$ (measurable functions taking their values in a set containing $V$) of $f(\cdot)$,
based on observations from model (1.1).



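For $M = M_n$ and $N = n/M_n$, denote

$$
\tau_\kappa(m, n) = M^{-1} \sum_{l=1}^{M} N^{-2\kappa d_l}\, |g_m(u_l)|^{2\kappa}, \quad \kappa = 1 \ \text{or} \ 2 \ \text{or} \ 4, \tag{4.3}
$$

and

$$
\Delta_\kappa(j, n) = |C_j|^{-1} \sum_{m \in C_j} \tau_\kappa(m, n)\, [\tau_1(m, n)]^{-2\kappa}, \quad \kappa = 1 \ \text{or} \ 2. \tag{4.4}
$$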
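These two quantities drive all subsequent bounds, so a small numerical sketch may be useful. It assumes the readings $\tau_\kappa(m,n) = M^{-1}\sum_l N^{-2\kappa d_l}|g_m(u_l)|^{2\kappa}$ and $\Delta_\kappa(j,n) = |C_j|^{-1}\sum_{m\in C_j}\tau_\kappa(m,n)[\tau_1(m,n)]^{-2\kappa}$; the function names and the dictionary-based input are hypothetical.

```python
def tau(kappa, g_m, d, N):
    """tau_kappa(m, n) for one frequency m; g_m lists g_m(u_l) over the M channels."""
    M = len(g_m)
    return sum(N ** (-2 * kappa * dl) * abs(gl) ** (2 * kappa) for gl, dl in zip(g_m, d)) / M

def delta(kappa, g_by_m, d, N):
    """Delta_kappa(j, n); g_by_m maps each m in C_j to the list of g_m(u_l)."""
    vals = [tau(kappa, g, d, N) * tau(1, g, d, N) ** (-2 * kappa) for g in g_by_m.values()]
    return sum(vals) / len(vals)

# Two channels with d_l = 1/4 and N = 16, so that N^(-2 d_l) = 1/4.
d, N = [0.25, 0.25], 16
print(tau(1, [1.0, 1.0], d, N))              # tau_1 for g_m(u_l) = 1
print(delta(1, {1: [1.0, 1.0]}, d, N))       # Delta_1 is the average of 1 / tau_1
```

With independent errors ($d_l = 0$) the factor $N^{-2\kappa d_l}$ equals one, and the quantities no longer depend on $N$.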



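The expression $\tau_1(m, n)$ appears in both the lower and the upper bounds for the
$L^2$-risk. Hence, we impose the following assumption:

Assumption A2: For some constants $\nu_1, \nu_2, \lambda_1, \lambda_2 \in \mathbb{R}$, $\alpha_1, \alpha_2 \ge 0$ ($\lambda_1, \lambda_2 > 0$ if
$\alpha_1 = \alpha_2 = 0$, $\nu_1 = \nu_2 = 0$) and $K_3, K_4, \beta > 0$, independent of $m$ and $n$, and for some
sequence $\varepsilon_n > 0$, independent of $m$, one has

$$
K_3\, \varepsilon_n\, |m|^{-2\nu_1} (\ln |m|)^{-\lambda_1} e^{-\alpha_1 |m|^\beta} \le \tau_1(m, n) \le K_4\, \varepsilon_n\, |m|^{-2\nu_2} (\ln |m|)^{-\lambda_2} e^{-\alpha_2 |m|^\beta}, \tag{4.5}
$$

where either $\alpha_1 \alpha_2 \ne 0$ or $\alpha_1 = \alpha_2 = 0$ and $\nu_1 = \nu_2 = \nu > 0$. The sequence $\varepsilon_n$ in (4.5) is
such that

$$
n^* = n \varepsilon_n \to \infty \quad (n \to \infty). \tag{4.6}
$$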
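Under Assumptions A1 and A2, the following statement is true.

Theorem 1 Let Assumptions A1 and A2 hold. Let $\{\varphi_{j_0, k}(\cdot), \psi_{j, k}(\cdot)\}$ be the periodic Meyer
wavelet basis discussed in Section 3. Let $s > \max(0, 1/p - 1/2)$, $1 \le p \le \infty$, $1 \le q \le \infty$
and $A > 0$. Then, as $n \to \infty$,

$$
R_n(B^s_{p,q}(A)) \ge
\begin{cases}
C (n^*)^{-\frac{2s}{2s+2\nu+1}} (\ln n^*)^{\frac{2s\lambda_2}{2s+2\nu+1}}, & \text{if } \alpha_1 = \alpha_2 = 0, \ \nu(2 - p) < p s^*, \\[4pt]
C \left( \dfrac{\ln n^*}{n^*} \right)^{\frac{2s^*}{2s^*+2\nu}} (\ln n^*)^{\frac{2s^*\lambda_2}{2s^*+2\nu}}, & \text{if } \alpha_1 = \alpha_2 = 0, \ \nu(2 - p) \ge p s^*, \\[4pt]
C (\ln n^*)^{-\frac{2s^*}{\beta}}, & \text{if } \alpha_1 \alpha_2 \ne 0.
\end{cases} \tag{4.7}
$$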
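To see which of the two polynomial regimes applies for given smoothness parameters, the case distinction can be coded directly. The sketch below covers only the regular-smooth case $\alpha_1 = \alpha_2 = 0$ and returns the power of $n^*$, ignoring logarithmic factors; it assumes the exponents are $2s/(2s + 2\nu + 1)$ when $\nu(2 - p) < p s^*$ and $2s^*/(2s^* + 2\nu)$ otherwise, and the function name is hypothetical.

```python
def rate_exponent(s, p, nu):
    """Exponent r such that the risk bound behaves like (n*)^(-r), up to log factors."""
    p_prime = min(p, 2.0)
    s_star = s + 0.5 - 1.0 / p_prime
    if nu * (2 - p) < p * s_star:
        return 2 * s / (2 * s + 2 * nu + 1)           # first case
    return 2 * s_star / (2 * s_star + 2 * nu)         # second case

print(rate_exponent(s=1.0, p=2.0, nu=1.0))   # first case applies whenever p >= 2
print(rate_exponent(s=1.0, p=1.0, nu=2.0))   # second case: nu (2 - p) >= p s*
```

Because the rate is expressed in terms of $n^* = n\varepsilon_n$ rather than $n$, the same exponent can correspond to quite different convergence speeds in $n$, depending on the long-memory parameters.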



5 Minimax Upper Bounds for the $L^2$-Risk

Let $\hat{f}_n(\cdot)$ be the (block thresholding) wavelet estimator defined by (3.9). Choose now $j_0$
and $J$ such that

$$
2^{j_0} = \ln n^*, \quad 2^J = (n^*)^{\frac{1}{2\nu+1}}, \quad \text{if } \alpha_1 = \alpha_2 = 0, \tag{5.1}
$$

$$
2^{j_0} = \frac{3}{8\pi} \left( \frac{\ln n^*}{2\alpha_1} \right)^{1/\beta}, \quad 2^J = 2^{j_0}, \quad \text{if } \alpha_1 \alpha_2 > 0. \tag{5.2}
$$

(Since $j_0 > J - 1$ when $\alpha_1 \alpha_2 > 0$, the estimator (3.9) only consists of the first (linear) part
and, hence, $\lambda_j$ does not need to be selected in this case.) Set, for some constant $\mu > 0$,
large enough,

$$
\lambda_j = \mu^2 (n^*)^{-1} \ln(n^*)\, 2^{2\nu j} j^{\lambda_1}, \quad \text{if } \alpha_1 = \alpha_2 = 0. \tag{5.3}
$$

Note that the choices of $j_0$, $J$ and $\lambda_j$ are independent of the parameters $s$, $p$, $q$ and $A$ of the
Besov ball $B^s_{p,q}(A)$; hence, the estimator (3.9) is adaptive with respect to these parameters.
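The tuning quantities for the regular-smooth case can be computed from $n^*$ and the kernel parameters alone, which is what makes the estimator adaptive. The sketch below assumes the readings $2^{j_0} = \ln n^*$, $2^J = (n^*)^{1/(2\nu+1)}$ and $\lambda_j = \mu^2 (n^*)^{-1}\ln(n^*)\,2^{2\nu j} j^{\lambda_1}$; it returns real-valued levels, which would have to be rounded to integers in practice, and the function name is hypothetical.

```python
import math

def tuning(n_star, nu, lam1, mu):
    """Resolution levels and threshold function for the case alpha_1 = alpha_2 = 0."""
    j0 = math.log2(math.log(n_star))          # from 2^{j0} = ln n*
    J = math.log2(n_star) / (2 * nu + 1)      # from 2^J = (n*)^{1 / (2 nu + 1)}

    def threshold(j):
        # lambda_j = mu^2 (n*)^{-1} ln(n*) 2^{2 nu j} j^{lambda_1}
        return mu ** 2 / n_star * math.log(n_star) * 2 ** (2 * nu * j) * j ** lam1

    return j0, J, threshold

j0, J, threshold = tuning(n_star=2 ** 10, nu=1.5, lam1=0.0, mu=1.0)
print(j0, J, threshold(1))
```

None of the inputs involves $s$, $p$, $q$ or $A$, so the same tuning is used for every Besov ball.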

Denote $(x)_+ = \max(0, x)$,

$$
\varrho =
\begin{cases}
\dfrac{(2\nu + 1)(2 - p)_+}{p(2s + 2\nu + 1)}, & \text{if } \nu(2 - p) < p s^*, \\[6pt]
\dfrac{(q - p)_+}{q}, & \text{if } \nu(2 - p) = p s^*, \\[6pt]
0, & \text{if } \nu(2 - p) > p s^*.
\end{cases} \tag{5.4}
$$

Assume that, in the case of $\alpha_1 = \alpha_2 = 0$, the sequence $\varepsilon_n$ is such that

$$
-h_1 \ln n \le \ln(\varepsilon_n) \le h_2 \ln n \tag{5.5}
$$

for some constants $h_1, h_2 \in (0, 1)$. Observe that condition (5.5) implies (4.6) and that
$\ln n^* \asymp \ln n$ $(n \to \infty)$. (Here, and in what follows, $u(n) \asymp v(n)$ means that there exist
constants $C_1, C_2$ ($0 < C_1 \le C_2 < \infty$), independent of $n$, such that $0 < C_1 v(n) \le u(n) \le
C_2 v(n) < \infty$ for $n$ large enough.)

The proof of the minimax upper bounds for the $L^2$-risk is based on the following
two lemmas.

Lemma 2 Let Assumptions A1 and A2 hold. Let the estimators $\hat{a}_{j_0 k}$ and $\hat{b}_{jk}$ of the
scaling and wavelet coefficients $a_{j_0 k}$ and $b_{jk}$, respectively, be given by (3.7) with $\hat{f}_m$ defined
by (3.4). Then, for all $j \ge j_0$,

$$
E|\hat{a}_{j_0 k} - a_{j_0 k}|^2 \le C n^{-1} \Delta_1(j_0, n) \quad \text{and} \quad E|\hat{b}_{jk} - b_{jk}|^2 \le C n^{-1} \Delta_1(j, n). \tag{5.6}
$$

If $\alpha_1 = \alpha_2 = 0$ and (5.5) holds, then, for any $j \ge j_0$,



E\b jk - b jk \ A < Cn A (lnn) 3Al (n*y^+ 



(5.7) 



Lemma 3 Let Assumptions A1, A2 and (5.5) hold. Let the estimators $\hat{b}_{jk}$ of the wavelet
coefficients $b_{jk}$ be given by (3.6) with $\hat{f}_m$ defined by (3.4). Let



(5.8) 



where $c_1$, $K_3$ and $h_1$ are defined in (8.8), (4.5) and (5.5), respectively. Then, for all $j \ge j_0$
and any $\kappa > 0$,



\b 3 k-b jk \ 2 >{An*)- 1 < n~\ 



(5.9) 



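Under Assumptions A1 and A2, and using Lemmas 2 and 3, the following statement
is true.

Theorem 2 Let Assumptions A1 and A2 hold. Let $\hat{f}_n(\cdot)$ be the wavelet estimator defined
by (3.9), with $j_0$ and $J$ given by (5.1) (if $\alpha_1 = \alpha_2 = 0$) or (5.2) (if $\alpha_1 \alpha_2 > 0$) and $\mu$
satisfying (5.8) with $\kappa = 5$. Let $s > 1/p'$, $1 \le p \le \infty$, $1 \le q \le \infty$ and $A > 0$. Then, under
(4.6) if $\alpha_1 \alpha_2 > 0$ or (5.5) if $\alpha_1 = \alpha_2 = 0$, as $n \to \infty$,

$$
\sup_{f \in B^s_{p,q}(A)} E \| \hat{f}_n - f \|^2 \le
\begin{cases}
C (n^*)^{-\frac{2s}{2s+2\nu+1}} (\ln n)^{\varrho + \frac{2s\lambda_1}{2s+2\nu+1}}, & \text{if } \alpha_1 = \alpha_2 = 0, \ \nu(2 - p) < p s^*, \\[4pt]
C \left( \dfrac{\ln n}{n^*} \right)^{\frac{2s^*}{2s^*+2\nu}} (\ln n)^{\varrho + \frac{2s^*\lambda_1}{2s^*+2\nu}}, & \text{if } \alpha_1 = \alpha_2 = 0, \ \nu(2 - p) \ge p s^*, \\[4pt]
C (\ln n^*)^{-\frac{2s^*}{\beta}}, & \text{if } \alpha_1 \alpha_2 > 0.
\end{cases} \tag{5.10}
$$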
Remark 2 Theorems 1 and 2 imply that, for the $L^2$-risk, the wavelet estimator $\hat{f}_n(\cdot)$
defined by (3.9) is asymptotically optimal (in the minimax sense), or near-optimal within
a logarithmic factor, over a wide range of Besov balls $B^s_{p,q}(A)$ of radius $A > 0$ with
$s > \max(1/p, 1/2)$, $1 \le p \le \infty$ and $1 \le q \le \infty$. The convergence rates depend on the
balance between the smoothness parameter $s$ (of the response function $f(\cdot)$), the kernel
parameters $\nu$, $\beta$, $\lambda_1$ and $\lambda_2$ (of the blurring function $g(\cdot,\cdot)$), the long memory parameters
$d_l$, $l = 1, 2, \ldots, M$ (of the error sequence $\xi^{(l)}$), and how the total number of observations
$n$ is distributed among the total number of channels $M$. In particular, $M$ and
$d_l$, $l = 1, 2, \ldots, M$, jointly determine the value of $\varepsilon_n$ which, in turn, defines the "essential"
convergence rate $n^* = n \varepsilon_n$, which may differ considerably from $n$. For example, if
$M = M_n = n^\theta$, $0 \le \theta < 1$, and $|g_m(u_l)|^2 \asymp |m|^{-2\nu}$ for every $l = 1, 2, \ldots, M$, then

$$
\varepsilon_n = M^{-1} \sum_{l=1}^{M} N^{-2d_l}, \tag{5.11}
$$

and, therefore, $n^{1 - 2d^*(1-\theta)} \le n^* \le n$, where $d^* = \max_{1 \le l \le M} d_l$, so that $n^*$ can take any
value between $n^{1 - 2d^*(1-\theta)}$ and $n$. This is further illustrated in Section 6 below.
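The effect of the long-memory parameters on the effective sample size can be made concrete with a small computation. The sketch below evaluates $\varepsilon_n = M^{-1}\sum_{l} N^{-2d_l}$ and $n^* = n\varepsilon_n$ with $N = n/M$, and compares $n^*$ with the bounds $n^{1 - 2d^*(1-\theta)}$ and $n$, where $\theta = \ln M/\ln n$; the function name and the particular choice of $d_l$ are hypothetical.

```python
import math

def effective_sample_size(n, d):
    """n* = n * eps_n with eps_n = M^{-1} * sum_l N^{-2 d_l}, where M = len(d) and N = n / M."""
    M = len(d)
    N = n / M
    eps = sum(N ** (-2 * dl) for dl in d) / M
    return n * eps

n, M = 2 ** 16, 2 ** 4
d = [0.4 * l / M for l in range(1, M + 1)]       # d_l grows linearly across channels
n_star = effective_sample_size(n, d)
theta = math.log(M) / math.log(n)
lower = n ** (1 - 2 * max(d) * (1 - theta))      # n^{1 - 2 d* (1 - theta)}
print(lower, n_star, n)
```

When all $d_l$ are zero the function returns $n$ itself, which recovers the independent-error setting.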

6 Illustrative Examples 

In this section, we consider some illustrative examples of application of the theory
developed in the previous sections. They are particular examples of inverse problems in
mathematical physics where one needs to recover initial or boundary conditions on the
basis of observations from a noisy solution of a partial differential equation.

We assume that condition (2.4) holds true and that there exist $\theta_1$ and $\theta_2$ such that
$M = M_n$ satisfies

$$
n^{\theta_1} \le M \le n^{\theta_2}, \quad 0 \le \theta_1 \le \theta_2 < 1. \tag{6.1}
$$

(Note that, under (6.1), $n^{1-\theta_2} \le N \le n^{1-\theta_1}$.)

Example 1 Consider the case when $g_m(\cdot)$, $m = 0, \pm 1, \pm 2, \ldots$, is of the form

$$
g_m(u) = C_g \exp\big( -K |m|^\beta q(u) \big), \quad u \in U, \tag{6.2}
$$

where $q(\cdot)$ in (6.2) is such that, for some $q_1$ and $q_2$,

$$
0 < q_1 \le q(u) \le q_2 < \infty, \quad u \in U. \tag{6.3}
$$

This setup takes place in the estimation of the initial condition in the heat conductivity
equation or the estimation of the boundary condition for the Dirichlet problem of
the Laplacian on the unit circle (see Examples 1 and 2 of Pensky and Sapatinas (2009,
2010)). In the former case, $g_m(u) = \exp(-4\pi^2 m^2 u)$, $u \in U$, so that $K = 4\pi^2$, $\beta = 2$,
$q(u) = u$, $q_1 = a$ and $q_2 = b$. In the latter case, $g_m(u) = C u^{|m|} = C \exp(-|m| \ln(1/u))$,
$0 < r_1 \le u \le r_2 < 1$, so that $K = 1$, $\beta = 1$, $q(u) = \ln(1/u)$, $q_1 = \ln(1/r_2)$ and
$q_2 = \ln(1/r_1)$.
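Both special cases can be checked against the general super-smooth form $C_g \exp(-K|m|^\beta q(u))$ with the stated values of $K$, $\beta$ and $q(\cdot)$. The sketch below does this for a few arbitrary test points; the function names are hypothetical and the constants $C_g$ and $C$ are set to one.

```python
import math

def g_general(m, u, K, beta, q, C_g=1.0):
    """General form g_m(u) = C_g * exp(-K |m|^beta q(u))."""
    return C_g * math.exp(-K * abs(m) ** beta * q(u))

def g_heat(m, u):
    """Heat equation case: g_m(u) = exp(-4 pi^2 m^2 u)."""
    return math.exp(-4 * math.pi ** 2 * m ** 2 * u)

def g_laplace(m, u):
    """Dirichlet problem on the unit circle: g_m(u) = u^|m| (with C = 1)."""
    return u ** abs(m)

# Heat equation: K = 4 pi^2, beta = 2, q(u) = u.
print(g_heat(2, 0.01), g_general(2, 0.01, 4 * math.pi ** 2, 2, lambda u: u))
# Laplacian on the circle: K = 1, beta = 1, q(u) = ln(1 / u).
print(g_laplace(-3, 0.5), g_general(-3, 0.5, 1.0, 1, lambda u: math.log(1 / u)))
```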

It is easy to see that, under conditions (6.2) and (6.3), for $\tau_1(m, n)$ given in (4.3),

$$
\tau_1(m, n) \le C_g^2\, \varepsilon_n \exp\big( -2K q_1 |m|^\beta \big) \quad \text{and} \quad \tau_1(m, n) \ge C_g^2\, \varepsilon_n \exp\big( -2K q_2 |m|^\beta \big),
$$

where $\varepsilon_n$ is of the form (5.11). Assumptions (2.4) and (6.1) lead to the following bounds
for $n^*$:

$$
n^{1 - 2d^*(1-\theta_1)} \le n^* \le n,
$$

so that $\ln n \asymp \ln n^*$. Therefore, according to Theorems 1 and 2,

$$
R_n(B^s_{p,q}(A)) \asymp (\ln n)^{-\frac{2s^*}{\beta}}. \tag{6.4}
$$

Note that, in this case, the value of $d^*$ has absolutely no bearing on the convergence
rates of the linear wavelet estimators: the convergence rates are determined entirely by the
smoothness parameter $s^*$ (of the response function $f(\cdot)$) and the kernel
parameter $\beta$ (of the blurring function $g(\cdot, \cdot)$).

In other words, in the case of super-smooth convolutions, LRD does not influence the
convergence rates of the suggested wavelet estimator. A similar effect is observed in the
case of kernel smoothing, see Section 2.2 in Kulik (2008).

Example 2 Suppose that the blurring function $g(\cdot, \cdot)$ is a box-car-like kernel, i.e.,

$$
g(u, t) = 0.5\, q(u)\, \mathbb{1}(|t| < u), \quad u \in U, \quad t \in T, \tag{6.5}
$$

where $q(\cdot)$ is some positive function which satisfies condition (6.3). In this case, the
functional Fourier coefficients $g_m(\cdot)$ are of the form

$$
g_0(u) = 1 \quad \text{and} \quad g_m(u) = (2\pi m)^{-1} q(u) \sin(2\pi m u), \quad m \in \mathbb{Z} \setminus \{0\}, \quad u \in U. \tag{6.6}
$$
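The closed form of the nonzero Fourier coefficients can be verified by numerical integration. The sketch below takes $q(u) = 1$, treats the kernel as supported on $[-u, u]$ with $u < 1/2$ (so that periodization plays no role), and compares a midpoint-rule approximation of $\int 0.5\,\mathbb{1}(|t| < u)\,e^{-i 2\pi m t}\,dt$ with $\sin(2\pi m u)/(2\pi m)$; the function names and the grid size are choices made for this illustration.

```python
import cmath
import math

def boxcar_coeff_numeric(m, u, n_grid=20000):
    """Midpoint-rule approximation of the integral of 0.5 * exp(-i 2 pi m t) over [-u, u]."""
    h = 2 * u / n_grid
    total = 0j
    for k in range(n_grid):
        t = -u + (k + 0.5) * h
        total += 0.5 * cmath.exp(-2j * math.pi * m * t)
    return total * h

def boxcar_coeff_closed(m, u):
    """Closed form (2 pi m)^{-1} sin(2 pi m u), valid for m != 0 and q(u) = 1."""
    return math.sin(2 * math.pi * m * u) / (2 * math.pi * m)

for m in (1, 2, 5):
    print(m, boxcar_coeff_numeric(m, 0.3), boxcar_coeff_closed(m, 0.3))
```

The coefficients decay only like $|m|^{-1}$ and vanish whenever $2mu$ is an integer, which is why observing several channels $u_l$ helps in this example.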

It is easy to see that estimation of the initial speed of a wave on a finite interval
(see Example 4 of Pensky and Sapatinas (2009) or Example 3 of Pensky and Sapatinas
(2010)) leads to $g_m(\cdot)$ of the form (6.6) with $q(u) = 1$. Assume, without loss of generality,
that $u \in [0, 1]$, so that $a = 0$, $b = 1$, and consider (equispaced channels) $u_l = l/M$,
$l = 1, 2, \ldots, M$, such that

$$
d_l = a_1 u_l + a_2, \quad 0 \le a_2 \le d^* < 1/2, \quad 0 \le a_1 + a_2 \le d^* < 1/2, \tag{6.7}
$$

i.e., condition (2.4) holds. Note that if $a_1 = 0$, then

$$
\tau_1(m, n) \asymp M^{-1} N^{-2a_2} (4\pi^2 m^2)^{-1} \sum_{l=1}^{M} \sin^2(2\pi m l / M),
$$
which is similar to the expression for $\tau_1(m, n)$ studied in Section 6 of Pensky and Sapatinas
(2010). Following their calculations, one obtains that, if $j_0$ in (3.9) is such that $2^{j_0} > (\ln n)^\delta$
for some $\delta > 0$ and $M \ge (32\pi/3) n^{1/3}$, then, for $n$ and $|m|$ large enough,

$$
\tau_1(m, n) \asymp N^{-2a_2} m^{-2}.
$$
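The role of the number of channels in this example can be seen from the trigonometric sum alone. For equispaced channels $u_l = l/M$, the identity $\sum_{l=1}^{M} \sin^2(2\pi m l/M) = M/2$ holds whenever $2m$ is not a multiple of $M$, while the sum vanishes when $M$ divides $2m$. The sketch below evaluates the displayed expression for $\tau_1(m, n)$ with $q(u) = 1$ and $a_1 = 0$, treating it as an equality for the purpose of illustration; the function name is hypothetical.

```python
import math

def tau1_boxcar(m, M, N, a2):
    """M^{-1} N^{-2 a2} (4 pi^2 m^2)^{-1} * sum_l sin^2(2 pi m l / M), with q(u) = 1."""
    s = sum(math.sin(2 * math.pi * m * l / M) ** 2 for l in range(1, M + 1))
    return N ** (-2 * a2) * s / (M * 4 * math.pi ** 2 * m ** 2)

N, a2 = 4096, 0.2
# Generic frequency: the sum equals M / 2, giving N^{-2 a2} / (8 pi^2 m^2).
print(tau1_boxcar(3, 16, N, a2), N ** (-2 * a2) / (8 * math.pi ** 2 * 9))
# Degenerate frequency: M divides 2m, every term vanishes up to rounding error.
print(tau1_boxcar(8, 16, N, a2))
```

The degenerate case illustrates why $M$ has to be large relative to the frequencies $|m|$ used by the estimator.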

Assume now, without loss of generality, that $a_1 > 0$. (Note that the case of $a_1 < 0$
can be handled similarly by changing $u$ to $1 - u$.) Below, we shall show that, in this case,
a similar result can be obtained under less stringent conditions on $M = M_n$. Indeed, the
following statement is true.



Lemma 4 Let $g(\cdot, \cdot)$ be of the form (6.5), where $q(\cdot)$ is some positive function which
satisfies (6.3), and let $d_l$, $l = 1, 2, \ldots, M$, be given by (6.7) with $a_1 \ge 0$. Assume (without
loss of generality) that $U = [0, 1]$, and consider $u_l = l/M$, $l = 1, 2, \ldots, M$. Let $M = M_n$
satisfy (6.1) with $\theta_1 > 0$ if $a_1 > 0$ and $M \ge (32\pi/3) n^{1/3}$ if $a_1 = 0$. If $m \in A_j$, where
$|A_j| = C_m 2^j$, for some absolute constant $C_m > 0$, with $j \ge j_0 > 0$, where $j_0$ is such that
$2^{j_0} \ge C_0 \ln n$ for some $C_0 > 0$, then, for $n$ and $|m|$ large enough,

$$
\tau_1(m, n) \asymp N^{-2a_2} m^{-2} (\log n)^{-1}. \tag{6.8}
$$

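It follows immediately from Lemma 4 that, if

$$
M = M_n = n^\theta, \quad 0 < \theta < 1,
$$

then Assumption A2 holds with $\alpha_1 = \alpha_2 = 0$, $\nu_1 = \nu_2 = \nu = 2$, $\varepsilon_n = n^{-2a_2(1-\theta)} (\ln n)^{-1}$
and $\lambda_1 = \lambda_2 = 0$. Note that $\varepsilon_n$ satisfies conditions (4.6) and (5.5), so that $\ln n \asymp \ln n^*$.
Therefore, according to Theorems 1 and 2,

$$
R_n(B^s_{p,q}(A)) \ge
\begin{cases}
C (n^*)^{-\frac{2s}{2s+5}}, & \text{if } 4 - 2p < p s^*, \\[4pt]
C \left( \dfrac{\ln n}{n^*} \right)^{\frac{s^*}{s^*+2}}, & \text{if } 4 - 2p \ge p s^*,
\end{cases} \tag{6.9}
$$

and

$$
\sup_{f \in B^s_{p,q}(A)} E \| \hat{f}_n - f \|^2 \le
\begin{cases}
C (n^*)^{-\frac{2s}{2s+5}} (\ln n)^{\varrho}, & \text{if } 4 - 2p < p s^*, \\[4pt]
C \left( \dfrac{\ln n}{n^*} \right)^{\frac{s^*}{s^*+2}} (\ln n)^{\varrho}, & \text{if } 4 - 2p \ge p s^*,
\end{cases} \tag{6.10}
$$

where

$$
n^* = n^{1 - 2a_2(1-\theta)} (\ln n)^{-1}
$$

and

$$
\varrho =
\begin{cases}
\dfrac{5(2 - p)_+}{p(2s + 5)}, & \text{if } 4 - 2p < p s^*, \\[6pt]
\dfrac{(q - p)_+}{q}, & \text{if } 4 - 2p = p s^*, \\[6pt]
0, & \text{if } 4 - 2p > p s^*.
\end{cases}
$$

Note that LRD affects the convergence rates in this case via the parameter $a_2$ that
appears in the definition (6.7).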
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7 Discussion. 



Deconvolution is a common problem in many areas of signal and image processing,
including, for instance, LIDAR (Light Detection and Ranging) remote sensing and the
reconstruction of blurred images. LIDAR is a laser device which emits pulses, reflections of
which are gathered by a telescope aligned with the laser (see, e.g., Park, Dho & Kong
(1997) and Harsdorf & Reuter (2000)). The return signal is used to determine the distance
and the position of the reflecting material. However, if the system response function of
the LIDAR is longer than the time resolution interval, then the measured LIDAR signal is
blurred and the effective accuracy of the LIDAR decreases. If $M$ ($M \geq 2$) LIDAR devices
are used to recover a signal, then we talk about a multichannel deconvolution problem.
This leads to the discrete model (1.1) considered in this work.

The multichannel deconvolution model (1.1) can also be thought of as the discrete
version of a model referred to as the functional deconvolution model by Pensky and
Sapatinas (2009, 2010). The functional deconvolution model has a multitude of applications.
In particular, it can be used in a number of inverse problems in mathematical physics
where one needs to recover initial or boundary conditions on the basis of observations
from a noisy solution of a partial differential equation. Lattes & Lions (1967) initiated
research in the problem of recovering the initial condition for parabolic equations based
on observations in a fixed-time strip. This problem and the problem of recovering the
boundary condition for elliptic equations based on observations in an internal domain
were studied in Golubev & Khasminskii (1999); the latter problem was also discussed in
Golubev (2004). Some of these specific models were considered in Section 6.

The multichannel deconvolution model (1.1) and its continuous version, the functional
deconvolution model, were studied by Pensky and Sapatinas (2009, 2010), under
the assumption that errors are independent and identically distributed Gaussian random
variables. The objective of this work was to study the multichannel deconvolution model
(1.1) from a minimax point of view, with the relaxation that errors exhibit LRD. We did
not limit our consideration to a specific type of LRD: the only restriction made was that
the errors should satisfy a general assumption in terms of the smallest and largest
eigenvalues of their covariance matrices. In particular, minimax lower bounds for the $L^2$-risk
in model (1.1) under such an assumption were derived when $f(\cdot)$ is assumed to belong to a
Besov ball and $g(\cdot,\cdot)$ has smoothness properties similar to those in Pensky and
Sapatinas (2009, 2010), including both regular-smooth and super-smooth convolutions. In
addition, an adaptive wavelet estimator of $f(\cdot)$ was constructed and shown to be
asymptotically optimal (in the minimax sense), or near-optimal within a logarithmic
factor, in a wide range of Besov balls. The convergence rates of the resulting estimators
depend on the balance between the smoothness parameter (of the response function $f(\cdot)$),
the kernel parameters (of the blurring function $g(\cdot,\cdot)$), the long memory parameters
$d_l$, $l = 1, 2, \ldots, M$ (of the error sequence), and how the total number of observations
is distributed among the total number of channels. Note that SRD is implicitly included
in our results by selecting $d_l = 0$, $l = 1, 2, \ldots, M$. In this case, the convergence rates
we obtained coincide with the convergence rates obtained under the assumption of
independent and identically distributed Gaussian errors by Pensky and Sapatinas (2009, 2010).

Under the assumption that the errors are independent and identically distributed
Gaussian random variables, for box-car kernels, it is known that, when the number of
channels in the multichannel deconvolution model (1.1) is finite, the precision of
reconstruction of the response function increases as the number of channels $M$ grows (even when
the total number of observations $n$ for all channels $M$ remains constant) and this requires
the channels to form a Badly Approximable (BA) $M$-tuple (see De Canditiis and Pensky
(2004, 2007)). Under the same assumption for the errors, Pensky and Sapatinas (2009,
2010) showed that the construction of a BA $M$-tuple for the channels is not needed and
a uniform sampling strategy for the channels with the number of channels increasing at a
polynomial rate (i.e., $u_l = l/M$, $l = 1, 2, \ldots, M$, for $M = M_n \geq (32\pi/3) n^{1/3}$) suffices to
construct an adaptive wavelet estimator that is asymptotically optimal (in the minimax
sense), or near-optimal within a logarithmic factor, in a wide range of Besov balls, when
the blurring function $g(\cdot,\cdot)$ is a box-car-like kernel (including both the standard box-car
kernel and the kernel that appears in the estimation of the initial speed of a wave on a finite
interval). Example 2 showed that a similar result is still possible under long-range
dependence with equispaced channels $u_l = l/M$, $l = 1, 2, \ldots, M$, $n^{\theta_1} \leq M = M_n \leq n^{\theta_2}$, for
some $0 \leq \theta_1 \leq \theta_2 < 1$, when $d_l = a_1 u_l + a_2$, $l = 1, 2, \ldots, M$, $0 \leq a_2 < 1/2$, $0 \leq a_1 + a_2 < 1/2$.

However, in real-life situations, the number of channels $M = M_n$ usually refers to
the number of physical devices and, consequently, may grow to infinity only at a slow rate
as $n \to \infty$. When $M = M_n$ grows slowly as $n$ increases (i.e., $M = M_n = o((\ln n)^{\alpha})$ for
some $\alpha \geq 1/2$), in the multichannel deconvolution model with independent and identically
distributed Gaussian errors, Pensky and Sapatinas (2011) developed a procedure for the
construction of a BA $M$-tuple on a specified interval, of a non-asymptotic length, together
with a lower bound associated with this $M$-tuple, which explicitly shows its dependence
on $M$ as $M$ is growing. This result was further used for the derivation of upper bounds
for the $L^2$-risk of the suggested adaptive wavelet thresholding estimator of the unknown
response function and, furthermore, for the choice of the optimal number of channels $M$
which minimizes the $L^2$-risk. It would be of interest to see whether or not similar upper
bounds are possible under long-range dependence. Another avenue of possible research
is to consider an analogous minimax study for the functional deconvolution model (i.e.,
the continuous version of the multichannel deconvolution model (1.1)) under long-range
dependence (e.g., modeling the errors as a fractional Brownian motion) and to compare the
convergence rates of the two models, in the spirit of the convergence rate
study of Pensky and Sapatinas (2010) for errors that are independent
and identically distributed Gaussian random variables.

8 Proofs 

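8.1 Proofs of the Statements in Section 2

Proof of Lemma 1. We prove the upper bound only since the proof of the lower bound
is similar. By (2.1)-(2.2), and the definitions of $H_N$ and $\mathcal{F}_d$,

\[ \lambda_{\max}(\Sigma) \leq C_{\max} \sup_{h \in H_N} \int_{-\pi}^{\pi} h(\lambda) |\lambda|^{-2d} \, d\lambda = 2 C_{\max} \sup_{h \in H_N} \int_{0}^{\pi} h(\lambda) |\lambda|^{-2d} \, d\lambda. \]

Now, we split $\int_0^{\pi} = \int_0^{\pi/N} + \int_{\pi/N}^{\pi}$. Since $d < 1/2$, for the first integral, we have

\[ \int_0^{\pi/N} h(\lambda) |\lambda|^{-2d} \, d\lambda \leq N \int_0^{\pi/N} \lambda^{-2d} \, d\lambda = \frac{N}{1-2d} \left(\frac{\pi}{N}\right)^{-2d+1} = \frac{\pi^{1-2d}}{1-2d} N^{2d}. \]

For the second integral, since $d > 0$, we have

\[ \int_{\pi/N}^{\pi} h(\lambda) |\lambda|^{-2d} \, d\lambda \leq \left(\frac{\pi}{N}\right)^{-2d} \int_{\pi/N}^{\pi} h(\lambda) \, d\lambda \leq \left(\frac{\pi}{N}\right)^{-2d} \int_0^{\pi} h(\lambda) \, d\lambda = O(N^{2d}). \]

This completes the proof of the lemma. □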
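The $N^{2d}$ growth of the largest eigenvalue can be checked numerically. The sketch below uses ARFIMA(0, d, 0) noise as an assumed example of a process whose spectral density behaves like $|\lambda|^{-2d}$ near the origin (this choice, the function names and the iteration count are assumptions made for illustration). It estimates $\lambda_{\max}$ of the Toeplitz covariance matrix by power iteration and rescales it by $N^{2d}$.

```python
import math

def arfima_autocov(d, n_lags):
    """Autocovariances gamma(0), ..., gamma(n_lags-1) of ARFIMA(0, d, 0), unit innovation variance."""
    gamma = [math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for k in range(1, n_lags):
        gamma.append(gamma[-1] * (k - 1 + d) / (k - d))
    return gamma

def largest_eigenvalue(gamma, iters=30):
    """Rayleigh-quotient estimate of lambda_max of the Toeplitz matrix [gamma(|i-j|)]."""
    n = len(gamma)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(gamma[abs(i - j)] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(wi * vi for wi, vi in zip(w, v))   # v has unit norm, so this is v' A v
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return lam

def scaled_lambda_max(d, n):
    """lambda_max / N^(2d); Lemma 1 suggests this stays bounded as N grows."""
    return largest_eigenvalue(arfima_autocov(d, n)) / n ** (2 * d)

if __name__ == "__main__":
    for n in (16, 32, 64, 128):
        print(n, scaled_lambda_max(0.3, n))
```

The printed ratios should stay within a fixed band while the unscaled eigenvalue keeps growing.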

8.2 Proof of the Minimax Lower Bounds for the $L^2$-Risk

In order to prove Theorem 1, we consider two cases: the dense case and the sparse case,
when the hardest functions to estimate are, respectively, uniformly spread over the unit
interval $T$ and are represented by only one term in a wavelet expansion.

The proof of Theorem 1 is based on Lemma A.1 of Bunea, Tsybakov and Wegkamp (2007),
which we reformulate here for the case of the $L^2$-risk.

Lemma 5 [Bunea, Tsybakov, Wegkamp (2007), Lemma A.1] Let $\Omega$ be a set of functions
of cardinality card$(\Omega) \geq 2$, such that
(i) $\|f - g\|^2 \geq 4\delta^2$ for $f, g \in \Omega$, $f \neq g$,
(ii) the Kullback divergences $K(P_f, P_g)$ between the measures $P_f$ and $P_g$ satisfy the
inequality $K(P_f, P_g) \leq \log(\mathrm{card}(\Omega))/16$ for $f, g \in \Omega$.
Then, for some absolute constant $C > 0$, one has

\[ \inf_{T_n} \sup_{f \in \Omega} E_f \|T_n - f\|^2 \geq C \delta^2. \]

The dense case. Let $\omega$ be the vector with components $\omega_k \in \{0, 1\}$. Denote the set
of all possible vectors $\omega$ by $\Omega = \{0, 1\}^{2^j}$. Note that the vector $\omega$ has $N = 2^j$ entries
and, hence, card$(\Omega) = 2^N$. Let $H(\tilde\omega, \omega) = \sum_{k=0}^{N-1} I(\tilde\omega_k \neq \omega_k)$ be the Hamming distance
between the binary sequences $\omega$ and $\tilde\omega$. Then, the Varshamov-Gilbert Lemma (see, e.g.,
Tsybakov (2008), p. 104) states that one can choose a subset $\Omega_1$ of $\Omega$, of cardinality at
least $2^{N/8}$, such that $H(\tilde\omega, \omega) \geq N/8$ for any $\omega, \tilde\omega \in \Omega_1$, $\omega \neq \tilde\omega$.
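For small $N$ the separated subset promised by the Varshamov-Gilbert Lemma can be exhibited explicitly. The sketch below is a greedy search, not the argument used in the lemma's proof. It encodes binary vectors as integers and collects vectors whose pairwise Hamming distance is at least a prescribed value; the function names are illustrative.

```python
def hamming(a, b):
    """Hamming distance between two binary vectors encoded as integers."""
    return bin(a ^ b).count("1")

def greedy_separated_subset(n_bits, min_dist, target_size):
    """Greedily pick vectors in {0,1}^n_bits with pairwise distance >= min_dist."""
    chosen = []
    for candidate in range(1 << n_bits):
        if all(hamming(candidate, c) >= min_dist for c in chosen):
            chosen.append(candidate)
            if len(chosen) == target_size:
                break
    return chosen

if __name__ == "__main__":
    N = 32                                   # N = 2^j with j = 5
    subset = greedy_separated_subset(N, N // 8, 2 ** (N // 8))
    print(len(subset), [format(w, "08b") for w in subset[:4]])
```

With $N = 32$ the search stops as soon as it has found $2^{N/8} = 16$ vectors at mutual distance at least $N/8 = 4$.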

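Consider two arbitrary sequences $\omega, \tilde\omega \in \Omega_1$ and the functions $f_j$ and $\tilde f_j$ given by

\[ f_j(t) = \rho_j \sum_{k=0}^{2^j-1} \omega_k \psi_{jk}(t) \quad \text{and} \quad \tilde f_j(t) = \rho_j \sum_{k=0}^{2^j-1} \tilde\omega_k \psi_{jk}(t), \quad t \in T. \]

Choose $\rho_j = A 2^{-j(s+1/2)}$, so that $f_j, \tilde f_j \in B^s_{p,q}(A)$. Then, calculating the $L^2$-norm
difference of $f_j$ and $\tilde f_j$, we obtain

\[ \|\tilde f_j - f_j\|^2 = \rho_j^2 \Big\| \sum_{k=0}^{2^j-1} (\tilde\omega_k - \omega_k) \psi_{jk} \Big\|^2 = \rho_j^2 H(\tilde\omega, \omega) \geq 2^j \rho_j^2 / 8. \]

Hence, we get $4\delta^2 = 2^j \rho_j^2/8$ in condition (i) of Lemma 5.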

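In order to apply Lemma 5, one needs to also verify condition (ii). For $f_\omega$ and $f_{\tilde\omega}$ with
$\omega, \tilde\omega \in \Omega_1$, denote by $h_\omega$ and $h_{\tilde\omega}$ the vectors with components, respectively,

\[ h_\omega(u_l, t_i) = g(u_l, t_i - \cdot) * f_\omega(\cdot), \quad i = 1, 2, \ldots, N, \quad l = 1, 2, \ldots, M, \]
\[ h_{\tilde\omega}(u_l, t_i) = g(u_l, t_i - \cdot) * f_{\tilde\omega}(\cdot), \quad i = 1, 2, \ldots, N, \quad l = 1, 2, \ldots, M. \]

Then, writing $h^{(l)}_\omega$ and $h^{(l)}_{\tilde\omega}$ for the parts of these vectors that correspond to channel $l$,

\[ K(P_{f_\omega}, P_{f_{\tilde\omega}}) = \frac{1}{2} \sum_{l=1}^{M} (h^{(l)}_\omega - h^{(l)}_{\tilde\omega})^T (\Sigma^{(l)})^{-1} (h^{(l)}_\omega - h^{(l)}_{\tilde\omega}) \leq 0.5 \sum_{l=1}^{M} \lambda_{\max}\big((\Sigma^{(l)})^{-1}\big) \|h^{(l)}_\omega - h^{(l)}_{\tilde\omega}\|^2. \]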



Now, since $\omega$ and $\tilde\omega$ are binary vectors, using Plancherel's formula, (8.1), and the fact
that $|\psi_{jk,m}| \leq 2^{-j/2}$, we derive that, under Assumptions A1 and A2,



mGCj i=l 

< 27riTf 1 n2V 2 Ai(j,n) < 2vr^ 2 A"f 1 n2 _2j ' s Ai(j',n), 

where $\Delta_1(j,n)$ is defined by (4.4).

Direct calculations yield that, under Assumptions A1, A2 and (4.6), for some constants
$c_3 > 0$ and $c_4 > 0$, independent of $n$,

' ess- 1 2 2 ^, if ai = a 2 = 0, 

Ai(j»<J (8.1) 
c 4 e" 1 2 2 ^'j A2 exp |ai 2^} , if aia > 0. 

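Apply now Lemma 5 with $j$ such that

\[ 2\pi A^2 K_1^{-1} n 2^{-2js} \Delta_1(j,n) \leq 2^j \ln 2/16, \]

i.e.,

\[
2^j \asymp
\begin{cases}
\left[n^* (\ln n^*)^{-\lambda_2}\right]^{\frac{1}{2s+2\nu+1}}, & \text{if } \beta = 0, \\
(\ln n^*)^{1/\beta}, & \text{if } \beta > 0,
\end{cases}
\]

to obtain

\[
\delta^2 \asymp
\begin{cases}
\left[n^* (\ln n^*)^{-\lambda_2}\right]^{-\frac{2s}{2s+2\nu+1}}, & \text{if } \beta = 0, \\
(\ln n^*)^{-2s/\beta}, & \text{if } \beta > 0.
\end{cases}
\tag{8.2}
\]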
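The choice of the resolution level in the dense case with $\beta = 0$ is a simple balance: $2^j = [n^*(\ln n^*)^{-\lambda_2}]^{1/(2s+2\nu+1)}$, and then $\delta^2$ is of order $2^{-2js}$. The Python sketch below evaluates both quantities; the function name and the numerical inputs are hypothetical and constants are ignored.

```python
import math

def dense_case_level(n_star, s, nu, lam2=0.0):
    """Return (2^j, order of delta^2) for the dense case with beta = 0."""
    base = n_star * math.log(n_star) ** (-lam2)
    two_j = base ** (1.0 / (2 * s + 2 * nu + 1))
    delta_sq = base ** (-2.0 * s / (2 * s + 2 * nu + 1))
    return two_j, delta_sq

if __name__ == "__main__":
    two_j, delta_sq = dense_case_level(n_star=1e6, s=2.0, nu=1.0, lam2=0.5)
    # delta^2 agrees with (2^j)^(-2s), since 4*delta^2 = 2^j * rho_j^2 / 8 with rho_j^2 = A^2 2^(-j(2s+1))
    print(two_j, delta_sq, two_j ** (-2 * 2.0))
```

The same balance with $2s'+2\nu$ in place of $2s+2\nu+1$ and one extra logarithm gives the sparse-case level.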

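The sparse case. Let the functions $f_j$ be of the form $f_j(t) = \rho_j \psi_{jk}(t)$, $t \in T$, and denote

\[ \Omega = \{ f_{jk}(t) = \rho_j \psi_{jk}(t) : k = 0, 1, \ldots, 2^j - 1, \ f_0 = 0 \}. \]

Thus, card$(\Omega) = 2^j$. Choose now $\rho_j = A 2^{-js'}$, so that $f_j \in B^s_{p,q}(A)$. It is easy to check
that, in this case, one has $4\delta^2 = \rho_j^2$ in Lemma 5, and that, for $f, g \in \Omega$,

\[ K(P_f, P_g) \leq 2\pi A^2 K_1^{-1} n 2^{-2js'} \Delta_1(j,n). \]

With

\[
2^j \asymp
\begin{cases}
\left[n^* (\ln n^*)^{-\lambda_2 - 1}\right]^{\frac{1}{2s'+2\nu}}, & \text{if } \beta = 0, \\
(\ln n^*)^{1/\beta}, & \text{if } \beta > 0,
\end{cases}
\]

we then obtain that $K(P_f, P_g) \leq 2\pi A^2 K_1^{-1} n 2^{-2js'} \Delta_1(j,n) \leq \log(\mathrm{card}(\Omega))/16$ and

\[
\delta^2 \asymp
\begin{cases}
\left[\frac{(\ln n^*)^{\lambda_2+1}}{n^*}\right]^{\frac{2s'}{2s'+2\nu}}, & \text{if } \beta = 0, \\
(\ln n^*)^{-2s'/\beta}, & \text{if } \beta > 0.
\end{cases}
\tag{8.3}
\]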



Recall that $s^* = \min\{s, s'\}$. By noting that

\[ 2s/(2s + 2\nu + 1) \leq 2s^*/(2s^* + 2\nu), \quad \text{if } \nu(2 - p) \leq ps^*, \tag{8.4} \]

we then choose the highest of the lower bounds in (8.2) and (8.3). This completes the
proof of the theorem. □
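The comparison behind (8.4) can be verified exactly on a grid of parameters. The sketch below assumes the usual definition $s' = s + 1/2 - 1/p$ (not restated in this section, so it is an assumption here) and uses exact rational arithmetic to confirm that the dense exponent $2s/(2s+2\nu+1)$ is smaller than the sparse exponent $2s^*/(2s^*+2\nu)$ precisely when $\nu(2-p) < ps^*$. Function names are illustrative.

```python
from fractions import Fraction as F

def s_prime(s, p):
    """Assumed definition s' = s + 1/2 - 1/p."""
    return s + F(1, 2) - 1 / p

def dense_exponent(s, nu):
    return 2 * s / (2 * s + 2 * nu + 1)

def sparse_exponent(s_star, nu):
    return 2 * s_star / (2 * s_star + 2 * nu)

def dense_bound_dominates(s, p, nu):
    """True when the dense-case lower bound decays more slowly than the sparse-case one."""
    s_star = min(s, s_prime(s, p))
    return dense_exponent(s, nu) < sparse_exponent(s_star, nu)

def condition_84(s, p, nu):
    """The condition nu*(2 - p) < p*s* from (8.4), in strict form."""
    s_star = min(s, s_prime(s, p))
    return nu * (2 - p) < p * s_star

def check_grid():
    ps = [F(1), F(3, 2), F(2), F(3)]
    nus = [F(1, 2), F(1), F(2)]
    return all(
        dense_bound_dominates(1 / p + F(k, 4), p, nu) == condition_84(1 / p + F(k, 4), p, nu)
        for p in ps for nu in nus for k in range(1, 9)
    )

if __name__ == "__main__":
    print(check_grid())
```

The grid restricts to $s > 1/p$, so that $s' > 1/2$ and all denominators are positive.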

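8.3 Proof of the Minimax Upper Bounds for the $L^2$-Risk

We start with proofs of Lemmas 2 and 3.

Proof of Lemma 2. First, consider model (1.1). Then, using (3.2), (3.4), (3.6)
and (3.7), one has

\[ \hat b_{jk} - b_{jk} = \sum_{m \in C_j} (\hat f_m - f_m) \overline{\psi_{mjk}}, \]

where

\[ \hat f_m - f_m = \frac{1}{\sqrt{N}} \left( \sum_{l=1}^{M} N^{-2d_l} \overline{g_m(u_l)} z_{lm} \right) \Big/ \left( \sum_{l=1}^{M} N^{-2d_l} |g_m(u_l)|^2 \right). \tag{8.5} \]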
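Formula (8.5) writes the error of each Fourier coefficient as a weighted average of the channel noises with weights proportional to $N^{-2d_l}\overline{g_m(u_l)}$. The sketch below computes the resulting variance under the simplifying assumption that the $z_{lm}$ are uncorrelated across channels with variances `var_z[l]`. All names and inputs are hypothetical. When `var_z[l]` equals $K_2 N^{2d_l}$ the variance reduces to $K_2/(n\,\tau_1(m,n))$ with $n = NM$.

```python
def weighted_sum(g_abs2, d, N, power):
    """sum_l N^(-power * d_l) * |g_m(u_l)|^2."""
    return sum(N ** (-power * dl) * g2 for dl, g2 in zip(d, g_abs2))

def tau1(g_abs2, d, N):
    """tau_1(m, n) = M^{-1} * sum_l N^(-2 d_l) |g_m(u_l)|^2."""
    return weighted_sum(g_abs2, d, N, 2) / len(d)

def coefficient_error_variance(g_abs2, d, N, var_z):
    """Variance of the right-hand side of (8.5) for channel-wise uncorrelated noise."""
    v_m = weighted_sum(g_abs2, d, N, 2)
    noise = sum(N ** (-4 * dl) * g2 * vz for dl, g2, vz in zip(d, g_abs2, var_z))
    return noise / (N * v_m ** 2)

if __name__ == "__main__":
    d = [0.1, 0.2, 0.3]            # hypothetical long-memory parameters d_l
    g_abs2 = [1.0, 0.5, 0.25]      # hypothetical values of |g_m(u_l)|^2
    N, K2 = 100, 2.0
    var_z = [K2 * N ** (2 * dl) for dl in d]
    print(coefficient_error_variance(g_abs2, d, N, var_z), K2 / (N * len(d) * tau1(g_abs2, d, N)))
```

Channels with stronger dependence (larger $d_l$) receive smaller weights, which is what keeps the variance at the order $1/(n\,\tau_1)$.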
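Consider vector $V^{(l)}$ with components

\[ V^{(l)}_m = N^{-2d_l} \, \overline{\psi_{mjk}} \, \overline{g_m(u_l)} \left[ \sum_{l'=1}^{M} N^{-2d_{l'}} |g_m(u_{l'})|^2 \right]^{-1}, \quad m \in C_j. \]

It is easy to see that, due to $|\psi_{mjk}| \leq 2^{-j/2}$ and the definition of $C_j$,

\[ \|V^{(l)}\|^2 \leq 4\pi |C_j|^{-1} N^{-4d_l} \sum_{m \in C_j} |g_m(u_l)|^2 \left[ \sum_{l'=1}^{M} N^{-2d_{l'}} |g_m(u_{l'})|^2 \right]^{-2}. \]

Define

\[ v_m = \sum_{l=1}^{M} N^{-2d_l} |g_m(u_l)|^2 = M \tau_1(m,n). \]

Hence,

\[ \|V^{(l)}\|^2 \leq 4\pi |C_j|^{-1} N^{-2d_l} N^{-2d_l} \sum_{m \in C_j} |g_m(u_l)|^2 v_m^{-2}. \]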



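Using Assumption A1, since $z_{lm}$ are independent for different $l$'s, we obtain

\begin{align*}
E|\hat b_{jk} - b_{jk}|^2
&= \frac{1}{N} \sum_{m_1, m_2 \in C_j} \overline{\psi_{m_1 jk}} \psi_{m_2 jk} \sum_{l=1}^{M} N^{-4d_l} v_{m_1}^{-1} v_{m_2}^{-1} \overline{g_{m_1}(u_l)} g_{m_2}(u_l) \, \mathrm{Cov}(z_{lm_1}, z_{lm_2}) \\
&= \frac{1}{N} \sum_{l=1}^{M} (V^{(l)})^T \Sigma^{(l)} \overline{V^{(l)}}
\leq \frac{1}{N} \sum_{l=1}^{M} \lambda_{\max}(\Sigma^{(l)}) \|V^{(l)}\|^2 \\
&\leq 4\pi K_2 |C_j|^{-1} N^{-1} \sum_{l=1}^{M} N^{-2d_l} \sum_{m \in C_j} |g_m(u_l)|^2 v_m^{-2} \\
&= 4\pi K_2 |C_j|^{-1} N^{-1} \sum_{m \in C_j} v_m^{-2} \sum_{l=1}^{M} N^{-2d_l} |g_m(u_l)|^2
= 4\pi K_2 |C_j|^{-1} N^{-1} \sum_{m \in C_j} v_m^{-1},
\end{align*}

so that

\[ E|\hat b_{jk} - b_{jk}|^2 \leq C n^{-1} |C_j|^{-1} \sum_{m \in C_j} [\tau_1(m,n)]^{-1}. \]

(One can obtain an upper bound for $E|\hat a_{j_0 k} - a_{j_0 k}|^2$ by following similar arguments.)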
In order to prove (5.7), define



B, = N 



-2d t 



M 



1=1 



Note that

\[ E(z_{lm_1} z_{lm_2} z_{lm_3} z_{lm_4}) \leq \left[ \prod_{i=1}^{4} E|z_{lm_i}|^4 \right]^{1/4}. \]

Consequently, using Assumption A1, the fact that $z_{lm}$ are independent for different $l$'s,
and that $E|z_{lm}|^4 = 3 [E|z_{lm}|^2]^2$ for standard (complex-valued) Gaussian random variables
$z_{lm}$, one obtains



E\b jk -b jk \ 4 = 0\N 2 Y^Bf 



+ O 



i=i 

M 



\^mjk\\9m 2 (ui)\ (E| 



4x1/4 



N 1 ^2 B l E ^mijk^m2jk9mi(ui)9rn(ui)Cov(zi mi ,Zi m2 ) 
1=1 mi,m2&Cj 

2 



M 



o\n- 2 Y b 



=1 



Y \^rnjk\ 2 \9m{ui)\ 2 ^ E| 



Zmi 



mgC, 



m£Ci 



o 



n 



rneCj 



Since $\sum_{m \in C_j} E|z_{lm}|^2 = O(|C_j|)$, one derives



1 T 2 (m,n) 



M 3 [n(m,n)] z 



+ 



A 2 (j ; 



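\[ E|\hat b_{jk} - b_{jk}|^4 = O\big( M^{-3} \Delta_2(j,n) + n^{-2} \Delta_1^2(j,n) \big). \tag{8.6} \]

It is straightforward to show that, when $\alpha_1 = \alpha_2 = 0$, one has

\[ \Delta_2(j,n) = O\big( 2^{6\nu j} j^{3\lambda_1} \varepsilon_n^{-3} \big). \]

Thus, using the analogous bound for $\Delta_1(j,n)$ and the fact that $2^j \leq 2^{J-1} \leq (n^*)^{1/(2\nu+1)}$, (8.6) can be rewritten as

\[ E|\hat b_{jk} - b_{jk}|^4 = O\big( 2^{6\nu j} j^{3\lambda_1} \varepsilon_n^{-3} M^{-3} + 2^{4\nu j} j^{2\lambda_1} \varepsilon_n^{-2} n^{-2} \big) = O\big( n^3 (\ln n)^{3\lambda_1} (n^*)^{-3/(2\nu+1)} \big). \]

Hence, (5.7) follows. This completes the proof of the lemma. □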
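Proof of Lemma 3. Consider a set of vectors

\[ \Omega_{jr} = \Big\{ v_k, \ k \in U_{jr} : \sum_{k \in U_{jr}} |v_k|^2 \leq 1 \Big\} \]

and a centered Gaussian process

\[ Z_{jr}(v) = \sum_{k \in U_{jr}} v_k (\hat b_{jk} - b_{jk}). \]

Note that

\[ \sup_{v \in \Omega_{jr}} Z_{jr}(v) = \Big[ \sum_{k \in U_{jr}} |\hat b_{jk} - b_{jk}|^2 \Big]^{1/2}. \]

We shall apply below a lemma of Cirelson, Ibragimov and Sudakov (1976) which
states that, for any $x > 0$,

\[ \Pr\Big( \sum_{k \in U_{jr}} |\hat b_{jk} - b_{jk}|^2 \geq (x + B_1)^2 \Big) \leq \exp\Big( -\frac{x^2}{2 B_2} \Big), \tag{8.7} \]

where

\[ B_1 = E\Big[ \sum_{k \in U_{jr}} |\hat b_{jk} - b_{jk}|^2 \Big]^{1/2} \leq \sqrt{ \frac{c_1 2^{2\nu j} j^{\lambda_1} \ln n}{n^*} }, \]

with $c_1$ defined in (8.1), and

\[ B_2 = \sup_{v \in \Omega_{jr}} \mathrm{Var}(Z_{jr}(v)) = \sup_{v \in \Omega_{jr}} E\Big| \sum_{k \in U_{jr}} v_k (\hat b_{jk} - b_{jk}) \Big|^2. \]

Denote

\[ w_{jm} = \sum_{k \in U_{jr}} v_k \overline{\psi_{mjk}}, \quad m \in C_j. \]

Then, under Assumption A2 with $\alpha_1 = \alpha_2 = 0$, using an argument similar to the proof of
(5.6), one obtains

\begin{align*}
B_2 &= \sup_{v \in \Omega_{jr}} N^{-1} \sum_{m_1, m_2 \in C_j} w_{jm_1} \overline{w_{jm_2}} \, E\Big[ \sum_{l=1}^{M} N^{-4d_l} v_{m_1}^{-1} v_{m_2}^{-1} \overline{g_{m_1}(u_l)} g_{m_2}(u_l) z_{lm_1} \overline{z_{lm_2}} \Big] \\
&\leq \sup_{v \in \Omega_{jr}} N^{-1} \sum_{l=1}^{M} N^{-4d_l} \lambda_{\max}(\Sigma^{(l)}) \sum_{m \in C_j} |w_{jm} g_m(u_l)|^2 v_m^{-2} \\
&\leq K_2 n^{-1} \sup_{v \in \Omega_{jr}} \Big\{ \sum_{m \in C_j} |w_{jm}|^2 [\tau_1(m,n)]^{-1} \Big\} \leq 4\pi C_3^* (n^*)^{-1} 2^{2\nu j} j^{\lambda_1},
\end{align*}

where $C_3^* = (K_3)^{-1} (\ln 2)^{\lambda_1} (2\pi/3)^{2\nu}$.

Apply now inequality (8.7) with $x$ such that $x^2 = 2 B_2 \kappa \ln n$, and note that

\[ (x + B_1)^2 = (n^*)^{-1} 2^{2j\nu} j^{\lambda_1} \ln n \Big( \sqrt{c_1} + \sqrt{8\pi\kappa K_3^{-1} (\ln 2)^{\lambda_1} (2\pi/3)^{2\nu}} \Big)^2 \]

and

\[ \mu^2 \geq 4 (1 - h_1)^{-1} \Big( \sqrt{c_1} + \sqrt{8\pi\kappa K_3^{-1} (\ln 2)^{\lambda_1} (2\pi/3)^{2\nu}} \Big)^2, \]

which guarantees (8.7). This completes the proof of the lemma. □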
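The concentration inequality used above can be illustrated in its simplest instance. For a standard Gaussian vector $Z$ in $\mathbb{R}^k$, the supremum of $\langle v, Z\rangle$ over the unit ball is $\|Z\|$, so $B_2 = 1$ and $B_1 = E\|Z\| \leq \sqrt{k}$. The Monte Carlo sketch below uses this toy setting, not the wavelet coefficients of the paper, and its function names, sample size and seed are assumptions. It compares the empirical frequency of $\|Z\| \geq x + \sqrt{k}$ with the bound $\exp(-x^2/2)$.

```python
import math
import random

def exceedance_frequency(k, x, n_draws=20000, seed=0):
    """Empirical Pr(||Z|| >= x + sqrt(k)) for a standard Gaussian vector Z in R^k."""
    rng = random.Random(seed)
    threshold = x + math.sqrt(k)          # sqrt(k) >= E||Z|| by Jensen's inequality
    hits = 0
    for _ in range(n_draws):
        norm = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)))
        hits += norm >= threshold
    return hits / n_draws

def gaussian_tail_bound(x, b2=1.0):
    """Right-hand side exp(-x^2 / (2 * B_2)) of the concentration inequality."""
    return math.exp(-x * x / (2.0 * b2))

if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5):
        print(x, exceedance_frequency(8, x), gaussian_tail_bound(x))
```

Choosing $x^2 = 2B_2\kappa\ln n$, as in the proof, turns the right-hand side into $n^{-\kappa}$.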

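Proof of Theorem 2. Direct calculations yield that, under Assumptions A1, A2
and (5.5), for some constants $c_1 > 0$ and $c_2 > 0$, independent of $n$, one has

\[
\Delta_1(j,n) \leq
\begin{cases}
c_1 \varepsilon_n^{-1} 2^{2\nu j} j^{\lambda_1}, & \text{if } \alpha_1 = \alpha_2 = 0, \\
c_2 \varepsilon_n^{-1} 2^{2\nu j} j^{\lambda_1} \exp\left\{ \alpha_1 \left(\frac{8\pi}{3}\right)^{\beta} 2^{j\beta} \right\}, & \text{if } \alpha_1 \alpha_2 > 0.
\end{cases}
\tag{8.8}
\]

Using (8.8), the proof of this theorem is now almost identical to the proof of Theorem 2 in
Pensky and Sapatinas (2010). This completes the proof of the theorem. □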

8.4 Proof of the Statement in Section 6

Proof of Lemma 4. Below we consider only the case of $a_1 > 0$. Validity of the statement
for $a_1 = 0$ follows from Pensky and Sapatinas (2010).
By direct calculations, one obtains that

\[ \tau_1(m,n) = M^{-1} (4\pi^2 m^2)^{-1} N^{-2a_2} \sum_{l=1}^{M} q^2(l/M) \sin^2(2\pi m l M^{-1}) N^{-2a_1 l/M}. \]

Therefore,

\[ (4\pi^2 m^2)^{-1} q_1^2 N^{-2a_2} S(m,n) \leq \tau_1(m,n) \leq (4\pi^2 m^2)^{-1} q_2^2 N^{-2a_2} S(m,n), \tag{8.9} \]

where

\[ S(m,n) = M^{-1} \sum_{l=1}^{M} \sin^2(2\pi m l M^{-1}) N^{-2a_1 l/M}. \]

Denote $p = N^{-2a_1/M}$, $x = 4\pi m M^{-1}$ and note that, as $n \to \infty$,

\[ p^M = N^{-2a_1} \to 0 \]

and

\[ p = \exp(-2a_1 M^{-1} \ln N) = 1 - 2a_1 M^{-1} \ln N + 2a_1^2 M^{-2} \ln^2 N + o(M^{-2} \ln^2 N), \tag{8.10} \]

since $M^{-1} \ln N \to 0$ as $n \to \infty$.



Using the fact that $\sin^2(x/2) = (1 - \cos x)/2$ and formula 1.353.3 of Gradshtein &
Ryzhik (1980), we obtain

\[ S(m,n) = \frac{1}{2M} \left[ \frac{1 - p^M}{1 - p} - \frac{1 - p\cos x - p^M \cos(Mx) + p^{M+1} \cos((M-1)x)}{1 - 2p\cos x + p^2} \right]. \]

Since $m$ is an integer and $x = 4\pi m M^{-1}$,

\[ \cos(Mx) = 1, \quad \sin(Mx) = 0, \quad \cos((M-1)x) = \cos x. \]

Therefore, simple algebraic transformations yield

\[ S(m,n) = \frac{p(p+1)(1 - p^M)(1 - \cos x)}{2M(1-p)\left[(1-p)^2 + 2p(1 - \cos x)\right]}. \]
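The closed form for $S(m,n)$ is easy to check numerically against the defining sum $M^{-1}\sum_{l=1}^{M}\sin^2(2\pi m l/M)\,p^l$ with $p = N^{-2a_1/M}$ and $x = 4\pi m/M$. The Python sketch below does this for a few hypothetical parameter values; the function names are illustrative.

```python
import math

def S_direct(m, M, N, a1):
    """S(m, n) computed from its definition as a finite sum."""
    p = N ** (-2.0 * a1 / M)
    return sum(math.sin(2 * math.pi * m * l / M) ** 2 * p ** l for l in range(1, M + 1)) / M

def S_closed(m, M, N, a1):
    """S(m, n) from the closed form p(p+1)(1-p^M)(1-cos x) / (2M(1-p)[(1-p)^2 + 2p(1-cos x)])."""
    p = N ** (-2.0 * a1 / M)
    c = 1.0 - math.cos(4 * math.pi * m / M)
    return p * (p + 1) * (1 - p ** M) * c / (2 * M * (1 - p) * ((1 - p) ** 2 + 2 * p * c))

if __name__ == "__main__":
    for m, M, N, a1 in [(3, 16, 100, 0.2), (5, 64, 1000, 0.1), (1, 4, 50, 0.3)]:
        print(m, M, S_direct(m, M, N, a1), S_closed(m, M, N, a1))
```

For $M = 4$ and $m = 1$ both expressions reduce to $p(1+p^2)/4$, which gives a quick hand check.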

The asymptotic expansion (8.10) for $p$ as $n \to \infty$ leads to

\[ \frac{1 - p^M}{2M(1-p)} \sim \frac{1 - N^{-2a_1}}{4a_1 \ln N \,(1 - a_1 M^{-1} \ln N)}, \tag{8.11} \]

so that, if $N$ is large enough, due to $p < 1$, one obtains an upper bound for $S(m,n)$:

\[ S(m,n) = \frac{1 - p^M}{2M(1-p)} \left[ \frac{(1-p)^2}{p(p+1)(1 - \cos x)} + \frac{2}{p+1} \right]^{-1} \leq \frac{1}{2a_1 \ln N}. \tag{8.12} \]



In order to obtain a lower bound for $S(m,n)$, we note that for $N$ large enough, one
has $1/2 < p < 1$. Consider the following two cases: $x \geq \pi/3$ and $x < \pi/3$. If $x \geq \pi/3$,
then $\cos x \leq 1/2$ and

\[ F(p, x) = \frac{(1-p)^2}{p(p+1)(1 - \cos x)} + \frac{2}{p+1} < 2. \]

If $x < \pi/3$, we can use the fact that $1 - \cos x = 2\sin^2(x/2) \geq 3x^2/8$, so that

\[ F(p, x) \leq 2\left[ 1 + \frac{8(1-p)^2}{3x^2} \right] \leq 2\left[ 1 + \frac{2a_1^2 \ln^2 N}{3\pi^2 m^2} \right] \]

for $N$ large enough.



Since $|m| = C_m 2^j \geq C_m C_0 \ln n$ for some $C_0 > 0$ and $\ln n \geq (1 - \theta_1)^{-1} \ln N$ due to
assumption (6.1), one has $m^2 \geq C_m^2 C_0^2 (1 - \theta_1)^{-2} \ln^2 N$ and

\[ S(m,n) \geq C (\ln N)^{-1}. \tag{8.13} \]

Observe now that $\ln N \asymp \ln n$. This completes the proof of the lemma. □
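The two-sided bound $S(m,n) \asymp (\ln N)^{-1}$ can be observed numerically. The sketch below evaluates $S(m,n)\ln N$ for $x = 4\pi m/M = \pi$, a frequency chosen so that $\cos x \leq 1/2$, and compares it with the explicit constants $(1 - N^{-2a_1})/(8a_1)$ and $1/(2a_1)$ that follow from the argument above when $1/2 < p < 1$. The parameter values and function names are hypothetical.

```python
import math

def S(m, M, N, a1):
    """S(m, n) = M^{-1} * sum_{l=1}^{M} sin^2(2 pi m l / M) * N^(-2 a1 l / M)."""
    p = N ** (-2.0 * a1 / M)
    return sum(math.sin(2 * math.pi * m * l / M) ** 2 * p ** l for l in range(1, M + 1)) / M

def scaled_S(m, M, N, a1):
    """S(m, n) * ln N, which should stay between two positive constants."""
    return S(m, M, N, a1) * math.log(N)

if __name__ == "__main__":
    M, m, a1 = 64, 16, 0.2                     # x = 4*pi*m/M = pi
    for N in (1e2, 1e4, 1e6):
        lower = (1 - N ** (-2 * a1)) / (8 * a1)
        upper = 1 / (2 * a1)
        print(int(N), lower, scaled_S(m, M, N, a1), upper)
```

Multiplying by $(4\pi^2 m^2)^{-1} N^{-2a_2}$ and the bounds on $q(\cdot)$ then gives the order of $\tau_1(m,n)$ stated in (6.8).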